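Computer science
Artificial intelligence
Scalability
Attention network
Encoding
Action (physics)
Sequence (biology)
Process (computing)
Feature (linguistics)
Artificial neural network
Task (project management)
Reinforcement learning
Machine learning
Representation (politics)
Operating system
Gene
Economics
Law
Political science
Biology
Politics
Database
Quantum mechanics
Philosophy
Management
Linguistics
Genetics
Physics
Chemistry
Biochemistry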
Authors
Dianxi Shi,Chenran Zhao,Yajie Wang,Huanhuan Yang,Gongju Wang,Hao Jiang,Chao Xue,Shaowu Yang,Yongjun Zhang
Identifier
DOI:10.1016/j.neucom.2021.10.093
Abstract
Deep reinforcement learning has made significant progress on multi-agent tasks in recent years. However, most previous studies focus on fully cooperative tasks, and their methods do not perform well in mixed tasks. In a mixed task, an agent must consider the information provided by both its friends and its enemies to learn its strategy, and that strategy is sensitive to the information received. Additionally, in the actor-critic framework, the input space of the critic network grows rapidly with the number of agents. It is therefore necessary to learn information representations efficiently in order to extract the important features. To this end, we present an approach that performs information representation with an attention mechanism. Our approach adopts the framework of centralized training and decentralized execution. We apply a multi-head hierarchical attention mechanism to centrally computed critics, so that the critics can process the received information more accurately and help the actors choose better actions. The hierarchical attention critic adopts a bi-level attention structure composed of an agent level and a group level, which are designed to assign different weights to friends' and enemies' information and then summarize it at each timestep. This structure achieves high efficiency and scalability in mixed tasks. Furthermore, we use feature extraction based on a recurrent neural network to encode the state-action sequence information of each agent. Experimental results show that our approach is not only applicable to cooperative environments but also performs better in mixed environments. In particular, in the predator-prey task, the reward obtained by our method is twice that of the baselines.
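The bi-level structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes single-head scaled dot-product attention with no learned projections, and the names `attend` and `hierarchical_attention` are hypothetical. The agent level weights individual friends and enemies separately; the group level then weights the two resulting summaries.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, items):
    """Scaled dot-product attention of one query over a list of vectors.

    Returns (weighted sum of items, attention weights).
    Assumption: keys and values are the raw item vectors (no learned projections).
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * x for q, x in zip(query, item)) / scale for item in items]
    weights = softmax(scores)
    context = [
        sum(w * item[k] for w, item in zip(weights, items))
        for k in range(len(query))
    ]
    return context, weights


def hierarchical_attention(own, friends, enemies):
    """Hypothetical bi-level attention: agent level first, then group level."""
    # Agent level: weight each friend and each enemy within its own group.
    friend_ctx, friend_w = attend(own, friends)
    enemy_ctx, enemy_w = attend(own, enemies)
    # Group level: weight the friend summary against the enemy summary.
    summary, group_w = attend(own, [friend_ctx, enemy_ctx])
    return summary, friend_w, enemy_w, group_w


if __name__ == "__main__":
    # Toy 2-dimensional encodings; the values are illustrative only.
    own = [1.0, 0.0]
    friends = [[1.0, 0.0], [0.0, 1.0]]
    enemies = [[-1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    summary, friend_w, enemy_w, group_w = hierarchical_attention(own, friends, enemies)
    print("summary:", summary)
    print("group weights (friends, enemies):", group_w)
```

Because each group is first reduced to a single vector, the summary passed on to the critic has a fixed size regardless of how many agents each group contains, which is one plausible reading of the scalability claim. A fuller version would add learned query, key, and value projections and multiple heads.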