Multi-UAV Rendezvous Trajectory Planning Based on Improved MADDPG Algorithm in Complex Dynamic Obstacle Environments
Authors
Xiaojun Xing, Yan Ma, Yichen Lei, Yan Li, Bing Xiao
Source
Journal: IEEE Transactions on Vehicular Technology [Institute of Electrical and Electronics Engineers]; Date: 2025-01-01; Volume/Issue: : 1-12
Identifier
DOI: 10.1109/tvt.2025.3624052
Abstract
Traditional trajectory planning algorithms for multiple UAVs struggle to establish cooperative mechanisms and adapt poorly to environments with dynamic obstacles. To address these limitations, an enhanced reinforcement learning algorithm based on the multi-agent deep deterministic policy gradient (MADDPG) algorithm and an attention mechanism is proposed for multi-UAV rendezvous trajectory planning in unknown complex environments. First, an attention mechanism from deep learning is introduced into the centralized critic network of MADDPG, enabling the model to adjust its attention dynamically in complex environments and improving learning efficiency. Second, a dense reward function based on guiding points is developed; it combines attractive and repulsive forces, which addresses the sparse-reward problem, accelerates convergence, and improves policy learning efficiency. Third, an Ornstein-Uhlenbeck (OU) noise network is incorporated to balance exploration and exploitation during training. Finally, the algorithm is compared with MADDPG, MATD3, and IDDPG in static obstacle environments, dynamic obstacle environments, and extended composite scenarios. The results show that the improved algorithm can effectively avoid collisions, successfully rendezvous at the target point, and achieve the fewest decision steps, the shortest trajectory length, and the highest rendezvous success rate. In particular, in scenarios with multiple dynamic obstacles, the improved algorithm can adjust the UAV flight paths in real time and successfully avoid all dynamic obstacles.
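
The abstract names Ornstein-Uhlenbeck (OU) noise as the exploration mechanism but gives no details. As a rough illustration only, the sketch below implements the standard discretised OU process, as commonly used with deterministic policy gradient methods, and adds it to a deterministic action. It is not the paper's "OU noise network": the class name `OUNoise`, the helper `noisy_action`, the parameter values (`theta`, `sigma`, `dt`), and the action bounds are all assumptions chosen for illustration.

```python
import math
import random


class OUNoise:
    """Discretised Ornstein-Uhlenbeck process (hypothetical helper, not from the paper).

    Produces temporally correlated noise that drifts back toward `mu`.
    """

    def __init__(self, dim, mu=0.0, theta=0.15, sigma=0.2, dt=1e-2, seed=None):
        self.dim = dim          # number of action dimensions
        self.mu = mu            # long-run mean the noise reverts to
        self.theta = theta      # mean-reversion rate (assumed value)
        self.sigma = sigma      # noise scale (assumed value)
        self.dt = dt            # time step of the discretisation (assumed value)
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        """Restart the process at its mean, e.g. at the start of an episode."""
        self.state = [self.mu] * self.dim

    def sample(self):
        """Advance one Euler-Maruyama step and return the new noise vector."""
        # x <- x + theta * (mu - x) * dt + sigma * sqrt(dt) * N(0, 1)
        scale = self.sigma * math.sqrt(self.dt)
        self.state = [
            x + self.theta * (self.mu - x) * self.dt + scale * self.rng.gauss(0.0, 1.0)
            for x in self.state
        ]
        return list(self.state)


def noisy_action(action, noise, low=-1.0, high=1.0):
    """Add OU noise to a deterministic action and clip to assumed bounds."""
    return [min(high, max(low, a + n)) for a, n in zip(action, noise.sample())]
```

In a multi-agent setting, one such process would typically be kept per UAV and reset at the start of each episode, with `sigma` often decayed over training so that the policy shifts from exploration toward exploitation; whether the paper does this is not stated in the abstract.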