Reset (finance)
Reinforcement learning
Computer science
Task (project management)
Probabilistic logic
Curriculum
Artificial intelligence
Autonomous agent
Controller (irrigation)
Risk analysis (engineering)
Human-computer interaction
Machine learning
Engineering
Systems engineering
Biology
Medicine
Financial economics
Pedagogy
Economics
Psychology
Agronomy
Authors
Matteo Turchetta,Andrey Kolobov,Shital Shah,Andreas Krause,Alekh Agarwal
Source
Journal: Cornell University - arXiv
Date: 2020-01-01
Citations: 38
Identifiers
DOI: 10.48550/arxiv.2006.12136
Abstract
In safety-critical applications, autonomous agents may need to learn in an environment where mistakes can be very costly. In such settings, the agent needs to behave safely not only after but also while learning. To achieve this, existing safe reinforcement learning methods make an agent rely on priors that let it avoid dangerous situations during exploration with high probability, but both the probabilistic guarantees and the smoothness assumptions inherent in the priors are not viable in many scenarios of interest such as autonomous driving. This paper presents an alternative approach inspired by human teaching, where an agent learns under the supervision of an automatic instructor that saves the agent from violating constraints during learning. In this model, we introduce the monitor that neither needs to know how to do well at the task the agent is learning nor needs to know how the environment works. Instead, it has a library of reset controllers that it activates when the agent starts behaving dangerously, preventing it from doing damage. Crucially, the choices of which reset controller to apply in which situation affect the speed of agent learning. Based on observing agents' progress, the teacher itself learns a policy for choosing the reset controllers, a curriculum, to optimize the agent's final policy reward. Our experiments use this framework in two environments to induce curricula for safe and efficient learning.
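To make the mechanism in the abstract concrete, below is a minimal, self-contained sketch of the teacher-student loop it describes: a teacher repeatedly picks a reset controller from a small library, the student trains under that controller's supervision, and the teacher updates its own choice policy based on the student's observed progress. Everything here (the ToyStudent, the epsilon-greedy BanditTeacher, the controller names and progress rates) is an illustrative assumption for exposition, not the paper's actual algorithm or code.

```python
# Minimal sketch of the curriculum-induction loop described in the abstract.
# All names and numbers below are illustrative assumptions, not the paper's API.
import random

RESET_CONTROLLERS = ["reset_to_start", "reset_to_safe_margin", "reset_near_goal"]

# Hypothetical per-controller learning-progress rates for the toy student.
PROGRESS_RATE = {
    "reset_to_start": 0.02,
    "reset_to_safe_margin": 0.05,
    "reset_near_goal": 0.08,
}


class ToyStudent:
    """Stand-in for the learning agent: its 'skill' improves faster under some
    reset controllers than others, mimicking curriculum effects."""

    def __init__(self):
        self.skill = 0.0

    def train_one_round(self, controller: str) -> float:
        # Progress is noisy and diminishes as skill approaches 1.0.
        gain = PROGRESS_RATE[controller] * (1.0 - self.skill) * random.uniform(0.5, 1.5)
        self.skill = min(1.0, self.skill + gain)
        return self.skill  # proxy for the student's current policy reward


class BanditTeacher:
    """Epsilon-greedy teacher that estimates which reset controller yields the
    most student progress; the paper's teacher is more sophisticated."""

    def __init__(self, arms, epsilon=0.2):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.counts = {a: 0 for a in self.arms}
        self.values = {a: 0.0 for a in self.arms}

    def choose(self) -> str:
        if random.random() < self.epsilon:
            return random.choice(self.arms)
        return max(self.arms, key=lambda a: self.values[a])

    def update(self, arm: str, progress: float) -> None:
        self.counts[arm] += 1
        n = self.counts[arm]
        self.values[arm] += (progress - self.values[arm]) / n  # incremental mean


def run_curriculum(rounds: int = 50) -> None:
    student = ToyStudent()
    teacher = BanditTeacher(RESET_CONTROLLERS)
    for _ in range(rounds):
        controller = teacher.choose()          # teacher picks a reset controller
        before = student.skill
        after = student.train_one_round(controller)  # student trains under it
        teacher.update(controller, after - before)   # teacher rewarded by progress
    print(f"final student skill: {student.skill:.3f}")
    print("teacher value estimates:", {a: round(v, 4) for a, v in teacher.values.items()})


if __name__ == "__main__":
    run_curriculum()
```

Running run_curriculum() prints the toy student's final skill and the teacher's per-controller value estimates; in the paper's setting the teacher optimizes the student's final policy reward rather than this per-round progress proxy, but the overall structure of the loop is the same.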