Linear-quadratic-Gaussian control
Reinforcement learning
Control theory
Robust control
Gaussian distribution
Quadratic equation
Computer science
Linear control system
Gaussian process
Mathematics
Control
Linear system
Artificial intelligence
Mathematical optimization
Control system
Engineering
Mathematical analysis
Physics
Electrical engineering
Quantum mechanics
Geometry
Authors
Leilei Cui, Tamer Başar, Zhong-Ping Jiang
Identifier
DOI: 10.1109/tac.2024.3397928
Abstract
This paper proposes a novel robust reinforcement learning framework for discrete-time linear systems with model mismatch that may arise from the sim-to-real gap. A key strategy is to invoke advanced techniques from control theory. Using the formulation of the classical risk-sensitive linear quadratic Gaussian control, a dual-loop policy optimization algorithm is proposed to generate a robust optimal controller. The dual-loop policy optimization algorithm is shown to be globally and uniformly convergent, and robust against disturbances during the learning process. This robustness property is called small-disturbance input-to-state stability and guarantees that the proposed policy optimization algorithm converges to a small neighborhood of the optimal controller as long as the disturbance at each learning step is relatively small. In addition, when the system dynamics are unknown, a novel model-free off-policy policy optimization algorithm is proposed. Finally, numerical examples are provided to illustrate the proposed algorithm.
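The dual-loop scheme described in the abstract builds on policy optimization for linear-quadratic problems. As a point of reference only, the sketch below shows the classical policy iteration for a discrete-time LQR problem (Hewer's algorithm), which is the kind of inner-loop evaluation/improvement update such frameworks refine; the paper's risk-sensitive, robust, and model-free elements are not reproduced here, and all matrices (A, B, Q, R) and the initial gain are illustrative assumptions rather than the authors' examples.

```python
# Minimal sketch: policy iteration for discrete-time LQR (Hewer's algorithm).
# This is NOT the paper's dual-loop risk-sensitive algorithm; it only
# illustrates the evaluation/improvement pattern on an assumed system.
import numpy as np
from scipy.linalg import solve_discrete_lyapunov, solve_discrete_are

A = np.array([[1.0, 0.5], [0.0, 1.0]])   # open-loop dynamics (assumed)
B = np.array([[0.0], [1.0]])
Q = np.eye(2)                             # state cost weight (assumed)
R = np.array([[1.0]])                     # input cost weight (assumed)

K = np.array([[0.5, 1.0]])                # initial stabilizing gain
assert np.max(np.abs(np.linalg.eigvals(A - B @ K))) < 1.0

for _ in range(50):
    # Policy evaluation: solve the Lyapunov equation
    #   P = (A - B K)^T P (A - B K) + Q + K^T R K
    Acl = A - B @ K
    P = solve_discrete_lyapunov(Acl.T, Q + K.T @ R @ K)
    # Policy improvement: greedy gain with respect to the evaluated P
    K_next = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    if np.linalg.norm(K_next - K) < 1e-10:
        K = K_next
        break
    K = K_next

# Cross-check against the Riccati solution of the same LQR problem.
P_star = solve_discrete_are(A, B, Q, R)
K_star = np.linalg.solve(R + B.T @ P_star @ B, B.T @ P_star @ A)
print("policy-iteration gain:", K)
print("Riccati gain:         ", K_star)
```

Given an initial stabilizing gain, this iteration converges to the Riccati-optimal gain; the robustness analysis in the paper concerns how such iterations behave when each step is perturbed by learning errors.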