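Hamilton-Jacobi-Bellman equation
Reinforcement learning
Identifier
Nonlinear system
Computer science
Optimal control
Mathematical optimization
Control theory (sociology)
Equivalence (formal languages)
Gradient descent
Residual
Jacobian matrix and determinant
Multi-agent system
Mathematics
Artificial neural network
Artificial intelligence
Applied mathematics
Control (management)
Algorithm
Physics
Discrete mathematics
Quantum mechanics
Programming language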
Authors
Guoxing Wen,C. L. Philip Chen,Bin Li
Identifier
DOI:10.1109/tie.2019.2946545
Abstract
This article proposes an optimized leader-follower formation control scheme for a class of nonlinear multiagent systems, using a simplified reinforcement learning (RL) algorithm with an identifier-critic-actor architecture. In general, optimal control is obtained by solving the Hamilton-Jacobi-Bellman (HJB) equation, but the equation associated with a nonlinear system is difficult to solve analytically. Although this difficulty can be effectively overcome by an RL strategy, existing RL algorithms are very complex because their updating laws are obtained by applying gradient descent to the square of the approximated HJB equation (the Bellman residual error). For a multiagent system, the state coupling problem makes these methods difficult to implement. In the proposed optimized scheme, the RL updating laws are derived from the negative gradient of a simple positive function that is equivalent to the HJB equation; therefore, the control algorithm is significantly simpler. Furthermore, to handle unknown system dynamics, an adaptive identifier is integrated into the control. Finally, theoretical analysis and simulation demonstrate that the optimized formation scheme can guarantee the desired control performance.
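To make the "gradient descent on the squared Bellman residual" idea concrete, here is a minimal sketch on a toy scalar linear-quadratic problem. It is not the paper's multiagent algorithm or its simplified updating law. The system x' = a*x + b*u, the cost q*x^2 + r*u^2, the quadratic value guess V(x) = p*x^2, the function names, and all numeric values are assumptions chosen for illustration only.

```python
import math

# Toy problem (all values are illustrative assumptions, not from the paper):
# dynamics x' = a*x + b*u, running cost q*x^2 + r*u^2, value guess V(x) = p*x^2.
A, B, Q, R = -1.0, 1.0, 1.0, 1.0


def bellman_residual(p, x):
    """HJB residual cost + V'(x) * x' under the greedy control u = -(B*p/R)*x."""
    u = -(B * p / R) * x
    return Q * x**2 + R * u**2 + 2.0 * p * x * (A * x + B * u)


def residual_gradient(p, x):
    """Derivative of the residual with respect to the critic weight p."""
    return (2.0 * A - 2.0 * B**2 * p / R) * x**2


def fit_value_weight(p0=0.0, lr=0.05, steps=500, states=(0.5, 1.0)):
    """Gradient descent on the mean of 0.5 * residual^2 over sample states."""
    p = p0
    for _ in range(steps):
        grad = sum(bellman_residual(p, x) * residual_gradient(p, x)
                   for x in states) / len(states)
        p -= lr * grad
    return p


def riccati_solution():
    """Closed-form positive root of Q + 2*A*p - B^2*p^2/R = 0, for comparison."""
    return R * (A + math.sqrt(A**2 + B**2 * Q / R)) / B**2


p_learned = fit_value_weight()
p_exact = riccati_solution()
```

In this scalar case the learned weight can be checked against the closed-form Riccati solution. For a nonlinear multiagent system no such closed form is generally available, and the residual gradient involves coupled states, which is the complexity the abstract refers to.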