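Reinforcement learning
Optimal control
Bounded function
Zero (linguistics)
Computer science
Mathematical optimization
Class (philosophy)
Control (management)
Zero-sum game
Nonlinear system
Process (computing)
Polynomial
Mathematics
Function (biology)
Control theory (sociology)
Artificial intelligence
Nash equilibrium
Mathematical analysis
Linguistics
Philosophy
Physics
Quantum mechanics
Evolutionary biology
Biology
Operating system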
Authors
Ruizhuo Song,Gaofu Yang,Frank L. Lewis
Identifier
DOI:10.1109/tnnls.2022.3191847
Abstract
In this article, we solve a class of mixed zero-sum games for nonlinear systems with unknown dynamics. A policy iteration algorithm based on integral reinforcement learning (IRL), which does not depend on system information, is proposed to obtain the optimal control of the competitor and the collaborators. An adaptive update law that combines a critic-actor structure with experience replay is proposed. The actor function not only approximates the optimal control of every player but also estimates the auxiliary control, which does not participate in the actual control process and exists only in theory. The parameters of the actor-critic structure are updated simultaneously. It is then proven that the parameter errors of the polynomial approximation are uniformly ultimately bounded. Finally, the effectiveness of the proposed algorithm is verified by two simulations.
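
To make the idea of IRL-based policy iteration concrete, the following is a minimal sketch under strong simplifying assumptions, not the algorithm from the article. It treats a single-player, scalar linear system dx/dt = a*x + b*u with cost integrand q*x^2 + r*u^2, rather than a multi-player mixed zero-sum game, and it uses a plain least-squares fit instead of an actor-critic structure with experience replay. The names `simulate_interval` and `irl_policy_iteration` and all parameter values are hypothetical. The policy evaluation step uses only measured states and accumulated cost (the drift `a` appears only inside the simulator), but the improvement step still uses the input gain `b`, whereas the abstract states that the article's method does not depend on system information.

```python
import math


def simulate_interval(x0, k, a, b, q, r, T, steps=1000):
    """Simulate dx/dt = a*x + b*u with u = -k*x over one interval of length T.

    Uses forward Euler. Returns the final state and the accumulated cost,
    which stand in for the measurements an IRL scheme would collect online.
    """
    dt = T / steps
    x, cost = x0, 0.0
    for _ in range(steps):
        u = -k * x
        cost += (q * x * x + r * u * u) * dt
        x += (a * x + b * u) * dt
    return x, cost


def irl_policy_iteration(a=-1.0, b=1.0, q=1.0, r=1.0, k0=0.0,
                         T=0.1, n_intervals=20, n_iters=10):
    """Policy iteration with an integral (interval) Bellman equation.

    Value function assumed to be V(x) = p * x**2, so over each interval
    p * (x(t)**2 - x(t+T)**2) = cost accumulated on [t, t+T].
    """
    k, p = k0, 0.0
    for _ in range(n_iters):
        # Policy evaluation: scalar least squares for p from interval data.
        num = den = 0.0
        x = 1.0  # restart each evaluation from the same initial state
        for _ in range(n_intervals):
            x_next, cost = simulate_interval(x, k, a, b, q, r, T)
            phi = x * x - x_next * x_next
            num += phi * cost
            den += phi * phi
            x = x_next
        p = num / den
        # Policy improvement: this step uses b (an assumption of this sketch).
        k = b * p / r
    return p, k


if __name__ == "__main__":
    p, k = irl_policy_iteration()
    # For these parameters the Riccati equation -2p - p^2 + 1 = 0 has the
    # positive root sqrt(2) - 1, which the estimate should approach.
    print(p, k, math.sqrt(2.0) - 1.0)
```

With the default parameters the open-loop system is stable, so the initial gain k0 = 0 is an admissible starting policy; an unstable system would need a stabilizing initial gain instead.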