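Reinforcement learning
Stateless protocol
Computer science
Social dilemma
Markov decision process
Markov chain
Artificial intelligence
Hierarchy
Machine learning
Markov process
Algorithm
State (computer science)
Psychology
Social psychology
Mathematics
Market economy
Statistics
Economics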
Authors
Ting Guo, Yuyu Yuan, Pengqian Zhao
Source
Journal: Applied Sciences
[Multidisciplinary Digital Publishing Institute]
Date: 2023-01-31
Volume (issue): 13 (3): 1807-1807
Citations: 8
Abstract
Recently, the social dilemma problem is no longer limited to unrealistic stateless matrix games but has been extended to temporally and spatially extended Markov games by multi-agent reinforcement learning. Many multi-agent reinforcement-learning algorithms have been proposed to solve sequential social dilemmas. However, most current algorithms focus on cooperation to improve the overall reward while ignoring the equality among agents, which could be improved in terms of practicality. Here, we propose a novel admission-based hierarchical multi-agent reinforcement-learning algorithm to promote cooperation and equality among agents. We extend the give-or-take-some model to Markov games, decompose the fairness of each agent, and propose an Admission reward. For better learning, we design a hierarchy consisting of a high-level policy and multiple low-level policies, where the high-level policy maximizes the Admission reward by choosing different low-level policies to interact with environments. In addition, the learning and execution of policies are realized through a decentralized method. We conduct experiments in multiple sequential social dilemmas environments and show that the Admission algorithm significantly outperforms the baselines, demonstrating that our algorithm can learn cooperation and equality well.
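The hierarchy described in the abstract, a high-level policy that picks among several low-level policies, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not define the Admission reward or the learning rule, so `placeholder_reward`, the epsilon-greedy `HighLevelSelector`, and the two fixed low-level policies `cooperate` and `defect` are hypothetical stand-ins and not the authors' method.

```python
import random


class HighLevelSelector:
    """Epsilon-greedy chooser over a fixed set of low-level policies.

    Hypothetical stand-in for a learned high-level policy.
    """

    def __init__(self, n_policies, epsilon=0.1, lr=0.5, seed=0):
        self.values = [0.0] * n_policies  # running value estimate per policy
        self.epsilon = epsilon
        self.lr = lr
        self.rng = random.Random(seed)

    def choose(self):
        # Explore with probability epsilon, otherwise pick the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))
        return max(range(len(self.values)), key=lambda i: self.values[i])

    def update(self, index, reward):
        # Move the estimate for the chosen policy toward the observed reward.
        self.values[index] += self.lr * (reward - self.values[index])


def cooperate(observation):
    """Low-level policy that always gives (illustrative only)."""
    return "give"


def defect(observation):
    """Low-level policy that always takes (illustrative only)."""
    return "take"


def placeholder_reward(action):
    # Assumption: a toy reward that favors giving.
    # This is NOT the Admission reward from the paper.
    return 1.0 if action == "give" else 0.0


def run(steps=200, seed=0):
    low_level = [cooperate, defect]
    selector = HighLevelSelector(len(low_level), seed=seed)
    for _ in range(steps):
        k = selector.choose()            # high level picks a low-level policy
        action = low_level[k](None)      # low level acts in the environment
        selector.update(k, placeholder_reward(action))
    return selector.values
```

In a decentralized setting such as the one the abstract mentions, each agent would hold its own selector and update it from its own reward signal; the sketch shows a single agent only.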