Reinforcement learning
Computer science
Residual
Artificial intelligence
Machine learning
Algorithm
Computer vision
Authors
Shuhuan Wen,Yaqi Shu,A.B. Rad,Zeteng Wen,Zhengzheng Guo,Simeng Gong
Identifier
DOI:10.1016/j.eswa.2024.125238
Abstract
The problem of autonomous navigation has attracted significant attention from the robotics research community in the last few decades. In this paper, we address the problem of low data utilization caused by the large amount of episode experience data. A maximum entropy algorithm based on prioritized experience replay (Learning Good Experience based on Soft Actor-Critic, LGE-SAC) is proposed to quickly reproduce past good experience episodes. Because deep reinforcement learning methods are prone to failing to plan ahead and explore the target position in long-sequence environments, a deep Residual Soft Actor-Critic (RSAC) is proposed to alleviate this problem. The reinforcement learning policy is fused with the Artificial Potential Field method to improve the generalization ability of the proposed algorithm, and thus the robot's adaptation to new test environments. To validate the effectiveness of the proposed algorithm, we conducted simulation experiments in the Gazebo simulator and real-world experiments on a Turtlebot3 robot equipped with a LiDAR sensor. Simulation and experimental results show that the proposed algorithm avoids obstacles and reaches the goal more effectively than other obstacle avoidance algorithms. Compared with the Artificial Potential Field method, the proposed RSAC algorithm increases the planning success rate in the test environment by 30%, halves the number of planning steps, and improves generalization ability.
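The abstract does not say how the reinforcement learning policy is fused with the Artificial Potential Field (APF) method. As an illustration only, the sketch below assumes one common fusion scheme, in which a classic APF controller supplies a base action and a learned policy adds a correction to it. The function names, gains (`k_att`, `k_rep`), influence radius `d0`, and the norm clipping are all hypothetical and are not taken from the paper.

```python
import math

def apf_action(robot, goal, obstacles, k_att=1.0, k_rep=0.5, d0=1.0):
    """Classic APF force: attraction toward the goal plus repulsion
    from every obstacle closer than the influence radius d0.
    All gains here are illustrative assumptions."""
    fx = k_att * (goal[0] - robot[0])
    fy = k_att * (goal[1] - robot[1])
    for ox, oy in obstacles:
        dx, dy = robot[0] - ox, robot[1] - oy
        d = math.hypot(dx, dy)
        if 0.0 < d < d0:
            # Negative gradient of 0.5 * k_rep * (1/d - 1/d0)^2,
            # which points away from the obstacle.
            mag = k_rep * (1.0 / d - 1.0 / d0) / (d ** 2)
            fx += mag * dx / d
            fy += mag * dy / d
    return fx, fy

def fused_action(base, correction, max_norm=1.0):
    """Add a learned correction (e.g. a policy output) to the APF base
    action, then clip the result to a maximum norm."""
    ax, ay = base[0] + correction[0], base[1] + correction[1]
    n = math.hypot(ax, ay)
    if n > max_norm:
        ax, ay = ax * max_norm / n, ay * max_norm / n
    return ax, ay

# Usage: an obstacle sits directly between the robot and the goal.
base = apf_action(robot=(0.0, 0.0), goal=(1.0, 0.0), obstacles=[(0.5, 0.0)])
action = fused_action(base, correction=(0.0, 0.5))
```

In this usage example the APF force alone points straight back from the obstacle along the line to the goal, so it offers no sideways escape. A learned correction with a lateral component, here a fixed placeholder for a policy output, gives the combined action a way around the obstacle.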