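Reinforcement learning
Nash equilibrium
Convergence (economics)
Mathematical economics
Stability (learning theory)
Endowment
Fictitious play
Mathematics
Population
Mathematical optimization
Field (mathematics)
Game theory
Computer science
Artificial intelligence
Machine learning
Economics
Pure mathematics
Demography
Sociology
Economic growth
Philosophy
Epistemology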
Authors
Xin Guo, Anran Hu, Renyuan Xu, Junzi Zhang
Identifier
DOI:10.1287/moor.2022.1274
Abstract
This paper presents a general mean-field game (GMFG) framework for simultaneous learning and decision making in stochastic games with a large population. It first establishes the existence of a unique Nash equilibrium to this GMFG, and it demonstrates that naively combining reinforcement learning with the fixed-point approach in classical mean-field games yields unstable algorithms. It then proposes value-based and policy-based reinforcement learning algorithms (GMF-V and GMF-P, respectively) with smoothed policies, with analysis of their convergence properties and computational complexities. Experiments on an equilibrium product pricing problem demonstrate that two specific instantiations of GMF-V with Q-learning and GMF-P with trust region policy optimization—GMF-V-Q and GMF-P-TRPO, respectively—are both efficient and robust in the GMFG setting. Moreover, their performance is superior in convergence speed, accuracy, and stability when compared with existing algorithms for multiagent reinforcement learning in the N-player setting. Funding: X. Guo acknowledges the generous support from Coleman Fung Chair endowment fund and the Tsinghua-Berkeley-Shenzhen-Institute.
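The role of smoothed policies mentioned in the abstract can be illustrated with a small sketch. It assumes a softmax over action values as the smoothing operator; the function names `softmax_policy` and `greedy_policy` and the `temperature` parameter are hypothetical and chosen for illustration, not taken from the paper's code. The sketch contrasts a greedy (argmax) policy, which can flip entirely under a tiny change in value estimates, with a softmax policy, which changes only slightly.

```python
import math


def softmax_policy(q_values, temperature=1.0):
    """Map action values to a smooth probability distribution (illustrative)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum before exponentiating for numerical stability.
    q_max = max(q_values)
    weights = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def greedy_policy(q_values):
    """Put all probability on the first action with the highest value."""
    best = q_values.index(max(q_values))
    return [1.0 if i == best else 0.0 for i in range(len(q_values))]


def max_policy_change(policy_fn, q_a, q_b):
    """Largest per-action probability change between two value estimates."""
    return max(abs(a - b) for a, b in zip(policy_fn(q_a), policy_fn(q_b)))


if __name__ == "__main__":
    # Two nearly identical value estimates whose ranking is swapped.
    q_before = [1.00, 1.01]
    q_after = [1.01, 1.00]
    print("greedy change: ", max_policy_change(greedy_policy, q_before, q_after))
    print("softmax change:", max_policy_change(softmax_policy, q_before, q_after))
```

In this toy case the greedy policy moves all of its probability mass from one action to the other, while the softmax policy shifts by well under one percentage point. That sensitivity of the argmax step is one plausible reading of why a naive fixed-point iteration could be unstable; the paper's own analysis should be consulted for the precise argument.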