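Discrete time and continuous time
Control theory (sociology)
Bounded function
Optimal control
Dynamic programming
Computer science
Nonlinear system
Artificial neural network
Bellman equation
Affine transformation
Controller (irrigation)
Lyapunov function
Convergence (economics)
Mathematical optimization
Mathematics
Control (management)
Artificial intelligence
Physics
Economics
Biology
Mathematical analysis
Agronomy
Quantum mechanics
Pure mathematics
Economic growth
Statistics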
Authors
Travis Dierks, S. Jagannathan
Identifier
DOI:10.1109/tnnls.2012.2196708
Abstract
In this paper, the Hamilton-Jacobi-Bellman equation is solved forward-in-time for the optimal control of a class of general affine nonlinear discrete-time systems without using value and policy iterations. The proposed approach, referred to as adaptive dynamic programming (ADP), uses two neural networks (NNs) to solve the infinite-horizon optimal regulation control of affine nonlinear discrete-time systems in the presence of unknown internal dynamics and a known control coefficient matrix. One NN approximates the cost function and is referred to as the critic NN, while the second NN generates the control input and is referred to as the action NN. The cost function and policy are updated once at the sampling instant, and thus the proposed approach can be referred to as time-based ADP. Novel update laws for tuning the unknown weights of the NNs online are derived. Lyapunov techniques are used to show that all signals are uniformly ultimately bounded and that the approximated control signal approaches the optimal control input with small bounded error over time. In the absence of disturbances, an optimal control is demonstrated. Simulation results are included to show the effectiveness of the approach. The end result is the systematic design of an optimal controller with guaranteed convergence that is suitable for hardware implementation.
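For orientation, the problem setting named in the abstract can be sketched in the standard textbook form for affine discrete-time optimal regulation. The symbols below (f, g, Q, R, J) are assumed notation for illustration and are not quoted from the paper: f stands for the unknown internal dynamics, g for the known control coefficient matrix, and R is taken to be symmetric positive definite.

```latex
% Affine nonlinear discrete-time system (assumed notation)
x_{k+1} = f(x_k) + g(x_k)\,u_k

% Infinite-horizon cost, the quantity a critic NN would approximate
J(x_k) = \sum_{i=k}^{\infty} \left[ Q(x_i) + u_i^{\top} R\, u_i \right]

% Discrete-time Hamilton-Jacobi-Bellman (Bellman optimality) equation
J^{*}(x_k) = \min_{u_k} \left[ Q(x_k) + u_k^{\top} R\, u_k + J^{*}(x_{k+1}) \right]

% Stationarity in u_k gives the optimal control, the quantity an action NN would approximate
u^{*}(x_k) = -\tfrac{1}{2}\, R^{-1} g^{\top}(x_k)\,
             \frac{\partial J^{*}(x_{k+1})}{\partial x_{k+1}}
```

In this form the optimal control depends on the next state x_{k+1}, and therefore on f, which is one way to see why unknown internal dynamics make a direct solution difficult and why function approximators are used.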