A Deep Reinforcement Learning Method for Multiple AGV Path Planning Based on MATD3 Algorithm

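Keywords: Reinforcement learning; Computer science; Motion planning; Path (computing); Algorithm; Artificial intelligence; Robot; Computer network
Authors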
Yu-Kai Fu,Ao Xu,Yiyang Liu,Hongfei Bai,Chao Deng
Source
Journal: Unmanned Systems [World Scientific]
Volume (Issue): 13 (06): 1531-1544
Identifier
DOI:10.1142/s2301385025410067
Abstract

Automated guided vehicles (AGVs) are widely used in intelligent logistics, and path planning has become a key challenge in AGV research. In large, complex, dynamic environments, multi-AGV unmanned systems suffer from low search efficiency, slow convergence, and in some cases failure to converge. To accelerate convergence during learning, this paper proposes a new deep reinforcement learning method, the heuristic soft action multi-agent twin delayed deep deterministic policy gradient (HA-MATD3) algorithm. Specifically, a dynamic reward function based on the artificial potential field method is introduced to score the actions of the AGVs, and a heuristic soft action and a reward network are introduced to optimize the multi-agent twin delayed deep deterministic policy gradient (MATD3) algorithm. First, each AGV generates an ideal heuristic soft action from its state and target information, and learning from this heuristic soft action addresses the problem of low search efficiency. Furthermore, the reward network judges the reward value of the action taken by the AGV, so that the generated path is efficient, collision-free, and safer. These improvements enrich the decision-making process and improve the adaptability and responsiveness of AGVs to varying environmental conditions. Finally, experimental results demonstrate that the proposed HA-MATD3 algorithm is effective in solving the multi-AGV path planning problem in complex environments. This research contributes to the development of unmanned systems, particularly multi-AGV path planning.
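The abstract does not give the form of the dynamic reward function, so the following is only a minimal sketch of how an artificial-potential-field reward could score an AGV's move. It assumes the classic quadratic attractive potential and the standard repulsive potential with an influence radius, and it rewards the decrease in total potential between two positions. All names and constants (`apf_reward`, `k_att`, `k_rep`, `d0`, `d_min`) are hypothetical and are not taken from the paper.

```python
import math


def attractive_potential(pos, goal, k_att=1.0):
    """Quadratic attractive potential: grows with squared distance to the goal."""
    return 0.5 * k_att * math.dist(pos, goal) ** 2


def repulsive_potential(pos, obstacles, k_rep=1.0, d0=2.0, d_min=1e-3):
    """Sum of repulsive potentials from obstacles within the influence radius d0."""
    total = 0.0
    for obs in obstacles:
        # Clamp the distance to avoid division by zero at an obstacle centre.
        d = max(math.dist(pos, obs), d_min)
        if d <= d0:
            total += 0.5 * k_rep * (1.0 / d - 1.0 / d0) ** 2
    return total


def apf_reward(prev_pos, next_pos, goal, obstacles):
    """Assumed reward: the drop in total potential caused by the move.

    Positive when the AGV moves toward the goal and away from obstacles,
    negative when it moves away from the goal or closer to an obstacle.
    """
    u_prev = attractive_potential(prev_pos, goal) + repulsive_potential(prev_pos, obstacles)
    u_next = attractive_potential(next_pos, goal) + repulsive_potential(next_pos, obstacles)
    return u_prev - u_next


if __name__ == "__main__":
    goal = (10.0, 0.0)
    obstacles = [(5.0, 5.0)]
    # A step toward the goal, far from the obstacle.
    print(apf_reward((0.0, 0.0), (1.0, 0.0), goal, obstacles))
    # A step toward the obstacle and away from the goal.
    print(apf_reward((5.0, 3.5), (5.0, 4.5), goal, obstacles))
```

Under these assumptions, a step that closes in on the goal while staying outside every obstacle's influence radius earns a positive reward, and a step into an obstacle's neighbourhood is penalized increasingly sharply as the distance shrinks. In a multi-AGV setting, the other vehicles' positions could be passed in as additional obstacles, although the abstract does not say whether the paper does this.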