Reinforcement Learning
Computer Science
Artificial Intelligence
Psychology
Social Psychology
Authors
Simon Ramstedt, Christopher Pal
Source
Journal: arXiv (Cornell University)
Date: 2019-11
Citations: 17
Identifier
DOI:10.48550/arxiv.1911.04448
Abstract
Markov Decision Processes (MDPs), the mathematical framework underlying most algorithms in Reinforcement Learning (RL), are often used in a way that incorrectly assumes that the state of an agent's environment does not change during action selection. As RL systems based on MDPs begin to find application in real-world safety-critical settings, this mismatch between the assumptions underlying classical MDPs and the reality of real-time computation may lead to undesirable outcomes. In this paper, we introduce a new framework in which states and actions evolve simultaneously, and we show how it is related to the classical MDP formulation. We analyze existing algorithms under the new real-time formulation and show why they are suboptimal when used in real time. We then use these insights to create a new algorithm, Real-Time Actor-Critic (RTAC), that outperforms the existing state-of-the-art continuous-control algorithm Soft Actor-Critic in both real-time and non-real-time settings. Code and videos can be found at https://github.com/rmst/rtrl.
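The central idea of the real-time formulation is that the action chosen at step t is computed while the environment keeps evolving, so it can only take effect at step t+1, and the action currently in flight becomes part of the agent's state. Below is a minimal sketch of this construction as a Gym-style environment wrapper, assuming the classic reset()/step() interface; the class and method names are illustrative assumptions, not the authors' API (their implementation lives at https://github.com/rmst/rtrl).

```python
import numpy as np


class RealTimeWrapper:
    """Illustrative sketch (not the authors' code) of a real-time MDP:
    the action selected now is applied only at the next step, modeling
    the time consumed by action selection itself."""

    def __init__(self, env):
        self.env = env  # assumed Gym-style env with reset()/step()
        self.pending_action = np.zeros(env.action_space.shape)

    def reset(self):
        self.pending_action = np.zeros(self.env.action_space.shape)
        s = self.env.reset()
        # The agent observes (state, pending action): in the real-time
        # formulation the in-flight action is part of the state.
        return (s, self.pending_action)

    def step(self, action):
        # Apply the action selected one step ago, not the current one.
        s, r, done, info = self.env.step(self.pending_action)
        # The freshly selected action becomes the new pending action
        # and is exposed to the agent as part of the next observation.
        self.pending_action = action
        return (s, self.pending_action), r, done, info
```

Pairing the new state with the pending action keeps the augmented process Markovian, which is the sense in which the abstract says states and actions evolve simultaneously; it also suggests why value functions in this setting naturally condition on a state-action pair, as exploited by RTAC.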