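Nonlinear system
Dynamic programming
Symbol
Convergence (economics)
Mathematics
Zero (linguistics)
Algorithm
Function (biology)
Perturbation (astronomy)
Applied mathematics
Mathematical optimization
Computer science
Economic growth
Quantum mechanics
Evolutionary biology
Biology
Arithmetic
Physics
Philosophy
Linguistics
Economics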
Authors
Xiong Yang,Mengmeng Xu,Qinglai Wei
Identifier
DOI:10.1109/tsmc.2023.3247888
Abstract
This article considers the $H_{\infty}$ control problem for nonlinear systems with unavailable dynamics and asymmetric saturating actuators. First, the $H_{\infty}$ control problem is converted into a zero-sum game by introducing a nonquadratic cost function. Then, to solve the Hamilton–Jacobi–Isaacs equation arising in the zero-sum game, a simultaneous policy iteration (SPI) algorithm is developed under the adaptive dynamic programming framework. It is also proved that the convergence of the SPI algorithm in essence amounts to the convergence of the sequential PI algorithm. To implement the SPI algorithm, critic, actor, and perturbation neural networks (NNs) are constructed to estimate the cost function, the control policy, and the perturbation, respectively. The weights of the three NNs are determined simultaneously by using the least-squares method together with the Monte Carlo integration technique. A notable characteristic of the SPI algorithm is that arbitrary control policies and perturbations can be used in the learning process. This allows the system's information to be replaced by data collected in advance along the system's trajectories. More importantly, the persistence of excitation condition is not required. Finally, simulations of two nonlinear examples are given to validate the present SPI algorithm.
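The abstract mentions determining NN weights by least squares together with Monte Carlo integration. The sketch below illustrates only that general idea on a toy problem and is not the SPI algorithm from the paper: the basis functions, the target function, the sampling domain, and all names (`basis`, `target`, `fit_weights`) are assumptions chosen for illustration. It fits weights w so that basis(x) @ w matches a known function over a square domain, with the integrals in the normal equations replaced by averages over uniformly sampled points.

```python
import numpy as np


def basis(x):
    """Hypothetical polynomial basis [x1^2, x1*x2, x2^2], one row per sample."""
    x1, x2 = x[:, 0], x[:, 1]
    return np.stack([x1**2, x1 * x2, x2**2], axis=1)


def target(x):
    """Illustrative function to fit (an assumption, not taken from the paper)."""
    x1, x2 = x[:, 0], x[:, 1]
    return x1**2 + x1 * x2 + 0.5 * x2**2


def fit_weights(n_samples=2000, seed=0):
    """Least-squares weights with Monte Carlo estimates of the integrals."""
    rng = np.random.default_rng(seed)
    # Uniform samples over the assumed domain [-1, 1]^2.
    x = rng.uniform(-1.0, 1.0, size=(n_samples, 2))
    phi = basis(x)
    y = target(x)
    # Sample averages approximate the integrals of phi*phi^T and phi*y
    # over the domain (the common volume factor cancels).
    gram = phi.T @ phi / n_samples
    rhs = phi.T @ y / n_samples
    # Solve the normal equations gram @ w = rhs.
    return np.linalg.solve(gram, rhs)


weights = fit_weights()
print(weights)
```

Because the toy target lies exactly in the span of the chosen basis, the recovered weights should be close to 1, 1, and 0.5. In a setting like the one the abstract describes, the regression targets would instead come from data collected along system trajectories.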