Exploration in Deep Reinforcement Learning: From Single-Agent to Multiagent Domain

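Reinforcement learning; Computer science; Artificial intelligence; Multi-agent system; Domain (mathematical analysis); Rebar; Psychology; Mathematics; Social psychology; Mathematical analysis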
Authors
Jianye Hao, Tianpei Yang, Hongyao Tang, Chenjia Bai, Jinyi Liu, Zhaopeng Meng, Peng Liu, Zhen Wang
Source
Journal: IEEE Transactions on Neural Networks and Learning Systems [Institute of Electrical and Electronics Engineers]
Volume (issue): 35 (7): 8762-8782 Citations: 70
Identifier
DOI:10.1109/tnnls.2023.3236361
Abstract

Deep Reinforcement Learning (DRL) and Deep Multi-agent Reinforcement Learning (MARL) have achieved significant successes across a wide range of domains, including game AI, autonomous vehicles, and robotics. However, DRL and deep MARL agents are widely known to be sample inefficient: millions of interactions are usually needed even for relatively simple problem settings, which prevents wide application and deployment in real-world industrial scenarios. One bottleneck behind this is the well-known exploration problem, i.e., how to efficiently explore the environment and collect informative experiences that benefit policy learning toward the optimal policy. This problem becomes more challenging in complex environments with sparse rewards, noisy distractions, long horizons, and non-stationary co-learners. In this paper, we conduct a comprehensive survey of existing exploration methods for both single-agent and multi-agent RL. We start the survey by identifying several key challenges to efficient exploration. We then provide a systematic survey of existing approaches by classifying them into two major categories: uncertainty-oriented exploration and intrinsic motivation-oriented exploration. Beyond these two main branches, we also include other notable exploration methods with different ideas and techniques. In addition to algorithmic analysis, we provide a comprehensive and unified empirical comparison of different exploration methods for DRL on a set of commonly used benchmarks. Based on our algorithmic and empirical investigation, we finally summarize the open problems of exploration in DRL and deep MARL and point out a few future directions.