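Control theory (sociology)
Dynamic programming
Computer science
Observer (physics)
Controller (irrigation)
Noise (video)
Bellman equation
State (computer science)
Mathematical optimization
Nonlinear system
Optimal control
Observability
Function (biology)
Residual
Markov decision process
Adaptive control
Quadratic equation
Term (time)
System dynamics
Value (mathematics)
Control (management)
Mathematics
Action (physics)
State observer
Quadratic programming
Linear-quadratic regulator
Dynamical systems theory
Convergence (economics)
State variable
Output feedback
Dynamical system (definition)
Context (archaeology)
Linear system
Noise measurement
Nonlinear control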
Authors
Santosh Mohan Rajkumar,Debdipta Goswami
Abstract
Output regulation in dynamical systems with unknown physics, partial state measurements, and uncharacterized sensor noise remains a central challenge in robotics, aerospace, and autonomous systems. This paper investigates the optimal output regulation problem for nonlinear systems in settings where neither the system dynamics nor an admissible control policy is available in advance. Four factors distinguish the problem considered here from existing approaches: only partial state information is measurable, policy iteration cannot be initiated because no admissible stabilizing controller is available, the output measurements are corrupted by noise with unknown statistics, and no belief state or observer estimate is assumed. To address these challenges, we develop a data-driven adaptive dynamic programming framework that learns an optimal output-regulating controller directly from streaming noisy output data. The proposed method employs an on-policy value iteration scheme that uses a structured quadratic action-value function together with a modified temporal-difference update to refine the critic. A learned derivative feedback term provides the excitation necessary to identify the residual policy without requiring any knowledge of the system state. The effectiveness of the method is demonstrated through a simulation study on a cart–pole output regulation task.
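To make the idea of value iteration with a quadratic action-value function concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a much simpler setting than the abstract describes: a known-to-be-linear plant, full and noise-free state measurements, a known quadratic stage cost, and a batch least-squares fit in place of the paper's modified temporal-difference update and derivative feedback term. The double-integrator matrices `A` and `B`, the weights `Qc` and `R`, and all function names are hypothetical choices for illustration. The critic is parametrized as Q(x, u) = zᵀ H z with z = [x; u], and the iteration starts from a zero value function, so no stabilizing initial controller is needed.

```python
import numpy as np

def quad_features(z):
    """Features phi(z) with z^T H z == phi(z) @ theta for symmetric H."""
    i, j = np.triu_indices(len(z))
    scale = np.where(i == j, 1.0, 2.0)  # off-diagonal products appear twice
    return scale * z[i] * z[j]

def theta_to_H(theta, n):
    """Rebuild the symmetric matrix H from its upper-triangular entries."""
    H = np.zeros((n, n))
    i, j = np.triu_indices(n)
    H[i, j] = theta
    H[j, i] = theta
    return H

def greedy_gain(H, nx):
    """Gain K of the policy u = -K x that minimizes the quadratic Q over u."""
    return np.linalg.solve(H[nx:, nx:], H[nx:, :nx])

def q_value_iteration(X, U, Xn, Qc, R, iters=300):
    """Fit H by repeated Bellman backups on transitions (x, u, x_next)."""
    nx, nu = X.shape[1], U.shape[1]
    Phi = np.array([quad_features(z) for z in np.hstack([X, U])])
    # Stage cost x^T Qc x + u^T R u for every sample.
    cost = (np.einsum("ij,jk,ik->i", X, Qc, X)
            + np.einsum("ij,jk,ik->i", U, R, U))
    P = np.zeros((nx, nx))  # value function V(x) = x^T P x, started at zero
    for _ in range(iters):
        # Bellman target: stage cost plus the current value of the next state.
        target = cost + np.einsum("ij,jk,ik->i", Xn, P, Xn)
        theta, *_ = np.linalg.lstsq(Phi, target, rcond=None)
        H = theta_to_H(theta, nx + nu)
        K = greedy_gain(H, nx)
        # Minimizing Q over u gives the updated value matrix.
        P = H[:nx, :nx] - H[:nx, nx:] @ K
    return H, K

# Hypothetical plant (a discretized double integrator), used only to
# generate data; the learner never reads A or B directly.
rng = np.random.default_rng(0)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Qc, R = np.eye(2), np.array([[1.0]])
X = rng.normal(size=(60, 2))   # sampled states
U = rng.normal(size=(60, 1))   # exploratory inputs
Xn = X @ A.T + U @ B.T         # observed next states
H, K = q_value_iteration(X, U, Xn, Qc, R)
print("learned feedback gain K:", K)
```

In this simplified setting the learned gain can be checked against the gain obtained from a model-based Riccati recursion. The setting in the abstract is harder on every axis (nonlinear dynamics, partial and noisy outputs, on-policy streaming data), so this sketch only illustrates the structure of the quadratic critic and the greedy policy extraction.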