Reinforcement learning
Computer science
Motion planning
Path (computing)
Algorithm
Artificial intelligence
Robot
Computer network
Authors
Yu-Kai Fu, Ao Xu, Yiyang Liu, Hongfei Bai, Chao Deng
Identifier
DOI:10.1142/s2301385025410067
Abstract
Automated guided vehicles (AGVs) are widely used in intelligent logistics, and path planning has become a key challenge in AGV research. In large, complex, dynamic environments, multi-AGV unmanned systems suffer from low search efficiency, slow convergence, and even failure to converge. To accelerate the convergence of AGVs during learning, this paper proposes a new deep reinforcement learning method, the heuristic soft action multi-agent twin delayed deep deterministic policy gradient (HA-MATD3) algorithm. Specifically, a dynamic reward function based on the artificial potential field method is introduced to score the actions of the AGVs, and a heuristic soft action and a reward network are introduced to optimize the multi-agent twin delayed deep deterministic policy gradient (MATD3) algorithm. First, each AGV generates an ideal heuristic soft action from its state and target information, and learning from this heuristic soft action addresses the problem of low search efficiency. Second, the reward network evaluates the reward value of the action taken by the AGV, ensuring that the generated path is efficient, collision-free, and safe. These improvements enrich the decision-making process and improve the adaptability and responsiveness of AGVs to varying environmental conditions. Finally, experimental results demonstrate that the proposed HA-MATD3 algorithm is effective in solving the multi-AGV path planning problem in complex environments. This research contributes to the development of unmanned systems, particularly multi-AGV path planning.
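The abstract does not give the form of the dynamic reward, so the following is only a minimal sketch of how an artificial potential field could score an AGV's move, assuming the textbook attractive and repulsive potentials. The function names (`apf_potential`, `apf_reward`), the gains `k_att` and `k_rep`, the influence radius `d0`, and the "decrease in potential" shaping are all illustrative assumptions, not the authors' reward function.

```python
import math

def apf_potential(pos, goal, obstacles, k_att=1.0, k_rep=1.0, d0=2.0):
    """Textbook APF potential at a 2-D position (all parameters are assumed)."""
    # Attractive term: grows quadratically with distance to the goal.
    u = 0.5 * k_att * math.dist(pos, goal) ** 2
    # Repulsive term: active only within the influence radius d0 of an obstacle.
    for obs in obstacles:
        d = max(math.dist(pos, obs), 1e-6)  # guard against division by zero
        if d < d0:
            u += 0.5 * k_rep * (1.0 / d - 1.0 / d0) ** 2
    return u

def apf_reward(prev_pos, next_pos, goal, obstacles):
    """Hypothetical shaping: reward equals the drop in potential caused by the move."""
    return (apf_potential(prev_pos, goal, obstacles)
            - apf_potential(next_pos, goal, obstacles))
```

Under this sketch, a step toward the goal earns a positive score, a step away earns a negative one, and the same step scores lower when it brings the AGV closer to an obstacle inside the radius `d0`.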