Reinforcement learning
Computer science
Randomness
Noise (video)
Theory (learning stability)
Artificial intelligence
Dilemma
Machine learning
Mathematics
Geometry
Statistics
Image (mathematics)
Authors
Zhitong Zhao,Ya Zhang,Siying Wang,Hong Qu
Identifier
DOI:10.1109/csis-iac60628.2023.10363845
Abstract
Multi-agent reinforcement learning (MARL) has proven to be an attractive approach for addressing the challenge of multi-agent collaboration. Many existing MARL algorithms require hand-crafted exploration parameters to alleviate the instability and low efficiency of the exploration-exploitation dilemma. However, meticulously hand-crafted exploration parameters limit the efficiency and adaptability of these algorithms. In this paper, we propose a learnable noise exploration method for multi-agent reinforcement learning. By introducing noise into each agent's individual value network, the proposed method injects a degree of randomness that drives exploration. The noise parameters are updated along with the network through gradient descent, avoiding hand-crafted exploration settings. Experimental results demonstrate the effectiveness of our method on a single-state matrix game, and show stable and significant performance improvements in the predator-prey game and the hard multi-agent particle environment.
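The core idea of learnable noise exploration resembles a noisy value-network layer: each weight is perturbed by Gaussian noise whose scale is itself a trainable parameter, so the amount of exploration is learned by gradient descent rather than hand-tuned. A minimal NumPy sketch of such a layer is shown below; the class name, initialization constants, and parameterization are illustrative assumptions, not the paper's exact design.

```python
import numpy as np

class NoisyLinear:
    """Sketch of a linear layer with learnable exploration noise.

    Effective weights are w = mu_w + sigma_w * eps with eps ~ N(0, 1).
    Both mu and sigma would be updated by the same gradient step as the
    rest of the value network, replacing a hand-crafted epsilon schedule.
    (Illustrative parameterization; not the authors' exact method.)
    """

    def __init__(self, in_dim, out_dim, sigma_init=0.5, seed=0):
        self.rng = np.random.default_rng(seed)
        bound = 1.0 / np.sqrt(in_dim)
        # Mean parameters, initialized like a standard linear layer.
        self.mu_w = self.rng.uniform(-bound, bound, (out_dim, in_dim))
        self.mu_b = self.rng.uniform(-bound, bound, out_dim)
        # Learnable noise scales (trained jointly with mu in practice).
        self.sigma_w = np.full((out_dim, in_dim), sigma_init * bound)
        self.sigma_b = np.full(out_dim, sigma_init * bound)

    def forward(self, x, explore=True):
        if explore:
            # Sample fresh noise, scaled by the learned sigma parameters.
            eps_w = self.rng.standard_normal(self.mu_w.shape)
            eps_b = self.rng.standard_normal(self.mu_b.shape)
            w = self.mu_w + self.sigma_w * eps_w
            b = self.mu_b + self.sigma_b * eps_b
        else:
            # Evaluation mode: use the mean weights only.
            w, b = self.mu_w, self.mu_b
        return x @ w.T + b
```

Because sigma is a parameter of the network, gradient descent can shrink it where exploration is no longer useful and keep it large where the value estimates remain uncertain, which is how the method avoids manual exploration schedules.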