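Backstepping
Control theory (sociology)
Nonlinear system
Hamilton–Jacobi–Bellman equation
Computer science
Reinforcement learning
Control (management)
Mathematical optimization
Nonlinear control
Surface (topology)
Mathematics
Optimal control
Adaptive control
Artificial intelligence
Physics
Geometry
Quantum mechanics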
Authors
Guoxing Wen,Ranran Zhou,Yanlong Zhao,Ben Niu
Identifier
DOI:10.1109/tsmc.2024.3379356
Abstract
This article develops optimized backstepping (OB) control combined with the dynamic surface (DS) technique for single-input–single-output (SISO) nonlinear strict-feedback systems. OB designs every subsystem control of backstepping as an optimized control, so that the entire backstepping control is optimized. However, the original OB design still has to differentiate the virtual controls repeatedly, which inevitably causes the problem of "differential explosion." To alleviate this problem, the OB control is combined with the DS technique. Furthermore, OB control has to carry out reinforcement learning (RL) in every backstepping step, so simplifying the RL algorithm is essential for achieving the combination. In this work, the optimized control derives both the critic and actor training laws from a simple positive function rather than from the square of the approximation of the Hamilton–Jacobi–Bellman (HJB) equation, which clearly simplifies the RL algorithm compared with traditional optimizing methods. Finally, the feasibility is illustrated through both theory and simulation.
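The abstract does not give the filter used by the DS technique. As a minimal sketch, assuming the standard first-order filter from the dynamic surface control literature, tau * dz/dt + z = alpha with z(0) = alpha(0), the Python below shows how a filtered signal z can stand in for a virtual control alpha, so that its derivative comes from the algebraic expression (alpha - z) / tau instead of from repeated analytic differentiation. The names `ds_filter_step`, `run_filter`, `tau`, `dt`, and the test signal sin(t) are illustrative assumptions, not taken from the paper.

```python
import math


def ds_filter_step(z, alpha, tau, dt):
    """One forward-Euler step of the assumed filter tau * dz/dt + z = alpha.

    Returns the updated filter state and the derivative estimate used for
    the step. The filter form is an assumption, not the paper's design.
    """
    z_dot = (alpha - z) / tau  # algebraic surrogate for d(alpha)/dt
    return z + dt * z_dot, z_dot


def run_filter(alpha_fn, tau=0.01, dt=0.001, t_end=2.0):
    """Pass a hypothetical virtual control alpha_fn(t) through the filter."""
    z = alpha_fn(0.0)  # initialise the filter at the signal's initial value
    z_dot = 0.0
    steps = int(round(t_end / dt))
    for k in range(steps):
        z, z_dot = ds_filter_step(z, alpha_fn(k * dt), tau, dt)
    return z, z_dot


# Illustrative signal: alpha(t) = sin(t), whose true derivative is cos(t).
z_final, z_dot_final = run_filter(math.sin)
print(z_final, math.sin(2.0))      # filtered signal vs. true signal at t = 2
print(z_dot_final, math.cos(2.0))  # derivative estimate vs. true derivative
```

A smaller `tau` makes z track alpha more closely but amplifies noise in the derivative estimate; with forward Euler, `dt` also has to stay well below `tau` for the iteration to remain stable.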