Reinforcement learning
Curse of dimensionality
Scalability
Computer science
Exploitation
Distributed computing
Key
State space
Scale
Artificial intelligence
Computer security
Mathematics
Quantum mechanics
Database
Statistics
Physics
Authors
Guannan Qu, Adam Wierman, Na Li
Source
Journal: Operations Research (Institute for Operations Research and the Management Sciences)
Date: 2022-02-23
Volume/Issue: 70 (6): 3601-3628
Citations: 12
Identifier
DOI: 10.1287/opre.2021.2226
Abstract
Highlighted by success stories like AlphaGo, reinforcement learning (RL) has emerged as a powerful tool for decision making in complex environments. However, the success of RL has thus far been limited to small-scale or single-agent systems. To apply RL to large-scale networked systems such as energy, transportation, and communication networks, a critical hurdle is the curse of dimensionality, because for these systems, the state and action space can be exponentially large in the number of nodes in the network. This article attempts to break this curse of dimensionality and designs a scalable RL method, named scalable actor critic (SAC), for large networked systems. The key technical contribution is to exploit the network structure to derive an exponential decay property, which enables the design of the SAC approach.
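The abstract gives only the high-level idea: because influence propagates through the network, a node's value depends mostly on its nearby neighbors, so per-node quantities can be truncated to a κ-hop neighborhood with geometrically small error. The sketch below is a toy illustration of that truncation idea, not the paper's algorithm; the graph, the decay rate `rho`, and the function names are all hypothetical.

```python
from collections import deque

def k_hop_neighborhood(adj, i, kappa):
    """Return {node: hop distance} for all nodes within kappa hops of i (BFS)."""
    dist = {i: 0}
    q = deque([i])
    while q:
        u = q.popleft()
        if dist[u] == kappa:
            continue  # do not expand past the kappa-hop boundary
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

# Toy networked system: a line graph of n nodes (assumed, not from the paper).
n = 10
adj = {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}
rho = 0.5            # assumed decay rate, rho < 1
state = [1.0] * n    # each node's local state

def local_q(i, state):
    """'True' local value: node j's influence decays as rho**dist(i, j)."""
    dist = k_hop_neighborhood(adj, i, n)  # n hops covers the whole graph
    return sum(rho ** d * state[j] for j, d in dist.items())

def truncated_q(i, state, kappa):
    """Truncated value: ignore every node farther than kappa hops from i."""
    dist = k_hop_neighborhood(adj, i, kappa)
    return sum(rho ** d * state[j] for j, d in dist.items())

# The truncation error shrinks geometrically as kappa grows.
errors = [abs(local_q(0, state) - truncated_q(0, state, k)) for k in range(n)]
```

Here the error at truncation radius κ is bounded by ρ^(κ+1)/(1-ρ), so even a small neighborhood gives a good approximation; this is the kind of structure the exponential decay property formalizes, and it is what lets each agent act on local information rather than the exponentially large global state.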