Reinforcement learning
Computer science
Multi-agent system
Artificial intelligence
Human-computer interaction
Distributed computing
Authors
Hiroto Ebara, T. Nakamura, Akira Taniguchi, Tadahiro Taniguchi
Identifier
DOI: 10.1109/iiai-aai-winter61682.2023.00073
Abstract
This paper proposes MASAC-ECo, an integrated model of multi-agent reinforcement learning with emergent communication based on probabilistic generative models, which enables two agents to learn cooperative actions. In this model, agents receive messages as discrete symbols that communicate each agent's state, based on the Metropolis-Hastings naming game (MHNG). With MHNG, the messages can emerge without either agent directly observing the other agent's state, and the emergent messages allow each agent to infer the other agent's state indirectly. Furthermore, each agent's policy is learned with a soft actor-critic, and by feeding the emergent message into the soft actor-critic as an input, each agent can learn cooperative actions conditioned on its own state and the received message. In the experiment, we demonstrated that MASAC-ECo can learn cooperative actions. Moreover, the experimental results show that its performance is comparable with that of the conventional method, even though the latter can use the other agent's state directly during training, whereas MASAC-ECo cannot.
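The abstract describes two mechanisms: a Metropolis-Hastings acceptance step by which a listener accepts or rejects a speaker's proposed discrete symbol using only its own state, and a policy that conditions on the agent's own state plus the accepted message. The sketch below is not the authors' code; the vocabulary size, belief distributions, and state dimension are illustrative assumptions, and only the MH acceptance rule and the (state, message) policy input are taken from the abstract.

```python
# Minimal sketch (assumed implementation, not from the paper):
# (1) an MHNG-style message exchange where the listener accepts a proposed
#     symbol with probability min(1, P_lis(m'|s_lis) / P_lis(m|s_lis)),
# (2) building the actor input by concatenating own state and a one-hot message,
#     as a soft actor-critic policy conditioned on (state, message) would use.
import numpy as np

rng = np.random.default_rng(0)
NUM_SYMBOLS = 5  # assumed size of the discrete message vocabulary


def propose_message(speaker_probs: np.ndarray) -> int:
    """Speaker samples a candidate symbol from its own belief over symbols."""
    return int(rng.choice(NUM_SYMBOLS, p=speaker_probs))


def mh_accept(listener_probs: np.ndarray, current_m: int, proposed_m: int) -> int:
    """Listener accepts the proposal using only its own belief over symbols.

    The listener never sees the speaker's state, which is what lets shared
    symbols emerge while each agent observes only its own state.
    """
    ratio = listener_probs[proposed_m] / max(listener_probs[current_m], 1e-12)
    return proposed_m if rng.random() < min(1.0, ratio) else current_m


def policy_input(own_state: np.ndarray, message: int) -> np.ndarray:
    """Concatenate the agent's state with a one-hot message for the actor."""
    one_hot = np.zeros(NUM_SYMBOLS)
    one_hot[message] = 1.0
    return np.concatenate([own_state, one_hot])


# Toy usage: two agents with arbitrary symbol beliefs exchange one message.
speaker_belief = np.array([0.1, 0.6, 0.1, 0.1, 0.1])
listener_belief = np.array([0.2, 0.5, 0.1, 0.1, 0.1])
m = 0  # listener's current symbol
m = mh_accept(listener_belief, m, propose_message(speaker_belief))
obs = rng.normal(size=4)  # assumed 4-dimensional local state
x = policy_input(obs, m)  # would be fed to the soft actor-critic actor network
print("accepted message:", m, "| policy input dim:", x.shape[0])
```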