Reinforcement learning
Convergence (economics)
Computer science
Nash equilibrium
Mathematical optimization
Gradient descent
Function (biology)
Bellman equation
Stability (learning theory)
Mathematical proof
Stability theory
Synchronization (alternating current)
Multi-agent system
Artificial neural network
Control (management)
Control theory (sociology)
Mathematics
Nonlinear system
Artificial intelligence
Geometry
Quantum mechanics
Computer network
Economics
Channel (broadcasting)
Physics
Machine learning
Economic growth
Biology
Evolutionary biology
Author
Kyriakos G. Vamvoudakis
Abstract
In this paper, we consider the problem of leader synchronization in systems with interacting agents in large networks while simultaneously satisfying energy-related, user-defined distributed optimization criteria. Because modeling such large networks is difficult, we derive a model-free formulation based on a separate distributed Q-learning function for every agent. Each Q-function is parametrized by the agent's own control, the controls of its neighbors, and the neighborhood tracking error. None of the agents has any information about where the leader is connected to the network or from where it spreads the desired information. The proposed algorithm uses an integral reinforcement learning approach with a separate distributed actor/critic network for each agent: a critic approximator to approximate each value function and an actor approximator to approximate each optimal control law. The tuning laws for each actor and critic approximator are derived using gradient descent. We provide rigorous stability and convergence proofs to show that the closed-loop system has an asymptotically stable equilibrium point and that the control policies form a graphical Nash equilibrium. We demonstrate the effectiveness of the proposed method on a network consisting of 10 agents. Copyright © 2016 John Wiley & Sons, Ltd.
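The abstract does not give the explicit dynamics, features, or tuning laws, so the following is only a minimal NumPy sketch of the general idea described above: each agent keeps its own critic and actor approximator and tunes both by (normalized) gradient descent on a local temporal-difference residual built from its neighborhood tracking error, with only one agent pinned to the leader. The single-integrator dynamics, static leader, quadratic critic features, linear actor, ring graph, and all step sizes are illustrative assumptions, not the authors' formulation.

```python
# Hypothetical sketch: distributed actor/critic with gradient-descent tuning
# for leader synchronization on a graph. Assumptions (not from the paper):
# single-integrator agents x_i' = u_i, static leader x0, critic V_i(e) ~ w_i*e^2,
# linear actor u_i = -k_i*e_i on the neighborhood tracking error e_i.
import numpy as np

rng = np.random.default_rng(0)

N = 10                        # number of follower agents (as in the paper's example)
dt = 0.01                     # integration / reinforcement interval
q, rho = 1.0, 0.1             # state and control weights in the local running cost
alpha_c, alpha_a = 0.5, 0.1   # critic / actor gradient-descent step sizes

# Ring communication graph; only agent 0 is pinned to the leader.
A = np.zeros((N, N))
for i in range(N):
    A[i, (i + 1) % N] = 1.0
    A[i, (i - 1) % N] = 1.0
g = np.zeros(N)
g[0] = 1.0
d = A.sum(axis=1)             # in-neighborhood degrees

x0 = 1.0                      # leader state (unknown to all but the pinned agent)
x = rng.normal(size=N)        # follower states
w = np.ones(N)                # critic weights: V_i(e) ~= w_i * e^2
k = 0.5 * np.ones(N)          # actor gains:    u_i  = -k_i * e_i

def tracking_error(x):
    """Neighborhood tracking error e_i = sum_j a_ij (x_i - x_j) + g_i (x_i - x0)."""
    return (A * (x[:, None] - x[None, :])).sum(axis=1) + g * (x - x0)

for step in range(20000):
    e = tracking_error(x)
    u = -k * e                                 # actor: current control policy
    r = (q * e**2 + rho * u**2) * dt           # running cost integrated over one interval
    x_next = x + dt * u                        # single-integrator dynamics (Euler step)
    e_next = tracking_error(x_next)

    # Critic: normalized gradient descent on the squared temporal-difference residual
    # td_i = r_i + V_i(e_i') - V_i(e_i), with V_i(e) = w_i * e^2.
    sigma = e_next**2 - e**2
    td = r + w * sigma
    w -= alpha_c * td * sigma / (1.0 + sigma**2)

    # Actor: move each gain toward the greedy gain implied by its critic,
    # k_i ~ (d_i + g_i) * w_i / rho, a local heuristic that ignores neighbors' updates.
    k_target = np.clip((d + g) * w / rho, 0.0, 10.0)
    k += alpha_a * (k_target - k)

    x = x_next

print("max |x_i - x0| after training:", np.abs(x - x0).max())
```

Running the sketch, all follower states approach the leader's value while the gains settle near the value suggested by the quadratic cost weights; it is meant only to illustrate the distributed actor/critic structure, not to reproduce the paper's stability or Nash-equilibrium guarantees.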