Authors
Haipeng Xiao, Lijun Fu, Chengya Shang, Yunfeng Lin, Yaxiang Fan
Abstract
In recent years, with the development of ship intelligence and uncrewed technology, autonomous path planning for uncrewed surface vehicles (USVs) has become very important. However, most existing studies ignore the limits that endurance and speed place on a USV's voyage and consider only course planning at a fixed speed. To this end, we propose a path planning method based on deep reinforcement learning (DRL). First, we establish a USV motion model. Then, we propose a new USV energy consumption (EC) model. Existing research typically builds the USV EC model from the combined effects of currents, winds, and waves, which requires precise ocean data and involves a complex modeling process. Moreover, such models fail to account for the influence of the USV's own speed on EC. In contrast, our EC model is based on the relationship between speed, marine environment, and propulsion load. It simplifies the modeling process and links USV EC to sailing speed and ocean environmental conditions. Next, the original soft actor-critic (SAC) algorithm uses a multilayer perceptron (MLP) as the action network. However, convolutional neural networks (CNNs) excel at capturing spatial and local features and, compared with an MLP, have stronger information acquisition capabilities. Therefore, we propose a model (FRCF) that combines fully connected layers with a CNN to replace the MLP as the action network of the SAC agent, aiming to improve the agent's convergence speed and performance. Finally, using SAC with FRCF as the action network (SAC-FRCF), together with the motion model and the new USV EC model, we achieve multiobjective cooperative intelligent path planning for energy, speed, and heading. Unlike traditional path planning methods that control the USV heading through discrete heading angles, we adjust USV speed and heading through continuous acceleration and angular velocity.
Meanwhile, we impose constraints on the USV's speed, heading angle, acceleration, and angular velocity to ensure that its motion complies with kinematic constraints. Experimental results show that, compared with the SAC algorithm using an MLP as the action network (SAC-MLP), SAC-FRCF reduces the exploration time by 46.40% and achieves stronger convergence and better path planning. Compared with SAC-MLP that does not optimize EC, it reduces EC by 28.38%. These results verify the effectiveness and superiority of the proposed method. In addition, the proposed method shows significant advantages over traditional and existing path planning methods, and its robustness and adaptability have also been verified in a more complex environment.
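
The continuous control scheme with kinematic constraints described above can be illustrated with a minimal Python sketch. It is not the authors' motion model: the simple planar kinematics, the limit values, the heading wrap to [-pi, pi), and all names (`USVState`, `step`, `MAX_SPEED`, and so on) are assumptions chosen only for illustration. The agent's action is taken to be a pair (acceleration, angular velocity), and each quantity is clipped to its bound before the state is advanced.

```python
import math
from dataclasses import dataclass

# Hypothetical limits; the abstract does not give numeric bounds.
MIN_SPEED = 0.0                    # m/s
MAX_SPEED = 5.0                    # m/s
MAX_ACCEL = 0.5                    # m/s^2
MAX_YAW_RATE = math.radians(15.0)  # rad/s


@dataclass
class USVState:
    x: float        # position east, m
    y: float        # position north, m
    heading: float  # rad, kept in [-pi, pi)
    speed: float    # m/s


def clip(value: float, low: float, high: float) -> float:
    """Clamp value to the closed interval [low, high]."""
    return max(low, min(high, value))


def wrap_angle(angle: float) -> float:
    """Map an angle in radians to [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def step(state: USVState, accel: float, yaw_rate: float, dt: float = 1.0) -> USVState:
    """Advance the state by one time step under a continuous action.

    The action (accel, yaw_rate) is clipped to its limits, then the
    resulting speed is clipped and the heading is wrapped, so every
    state the agent can reach satisfies the assumed constraints.
    """
    accel = clip(accel, -MAX_ACCEL, MAX_ACCEL)
    yaw_rate = clip(yaw_rate, -MAX_YAW_RATE, MAX_YAW_RATE)
    speed = clip(state.speed + accel * dt, MIN_SPEED, MAX_SPEED)
    heading = wrap_angle(state.heading + yaw_rate * dt)
    x = state.x + speed * math.cos(heading) * dt
    y = state.y + speed * math.sin(heading) * dt
    return USVState(x, y, heading, speed)
```

In a DRL setting, a function like `step` would sit inside the environment's transition, so that an out-of-range action from the policy is saturated rather than producing a physically infeasible motion.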