Computer science
Reinforcement learning
Interpretability
Scalability
Artificial intelligence
Observability
Graph
Robustness (evolution)
Heuristic
Machine learning
Distributed computing
Theoretical computer science
Biochemistry
Database
Mathematics
Gene
Chemistry
Applied mathematics
Authors
Zhenhui Ye, Ke Wang, Yining Chen, Xiaohong Jiang, Guanghua Song
Identifier
DOI:10.1109/tmc.2022.3146881
Abstract
In this paper, we aim to design a deep reinforcement learning (DRL) based control solution for navigating a swarm of unmanned aerial vehicles (UAVs) around an unexplored target area under partial observation, where the UAVs serve as Mobile Base Stations (MBSs) providing optimal communication coverage for the ground mobile users. To handle the information loss caused by partial observability, we introduce a novel network architecture named Deep Recurrent Graph Network (DRGN), which can obtain extra spatial information through graph-convolution-based inter-UAV communication and utilize historical features with a recurrent unit. Based on DRGN and maximum-entropy learning, we propose a stochastic DRL policy named Soft Deep Recurrent Graph Network (SDRGN). In SDRGN, a heuristic reward function is designed based on the local information of each UAV rather than global information; thus, SDRGN reduces the training cost and enables distributed online learning. We conducted extensive experiments to design the structure of DRGN and examine the performance of SDRGN. The simulation results show that the proposed model outperforms four state-of-the-art DRL-based approaches and three heuristic baselines, and demonstrate the scalability, transferability, robustness, and interpretability of SDRGN.
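To make the described architecture more concrete, the following is a minimal, hypothetical PyTorch sketch of a DRGN-style forward pass: attention-based message passing among UAVs within communication range (the graph-convolution step), followed by a recurrent unit that fuses the aggregated spatial features with historical state, and a stochastic policy head. All module, tensor, and parameter names (DRGNSketch, obs, adj, h_prev, and so on) are illustrative assumptions and are not taken from the authors' implementation.

# Hypothetical DRGN-style module: graph message passing over in-range UAVs + GRU memory.
# This is a sketch under assumed shapes, not the paper's code.
import torch
import torch.nn as nn

class DRGNSketch(nn.Module):
    def __init__(self, obs_dim: int, hidden_dim: int, action_dim: int):
        super().__init__()
        self.encode = nn.Linear(obs_dim, hidden_dim)          # per-UAV observation encoder
        self.msg_query = nn.Linear(hidden_dim, hidden_dim)    # attention query
        self.msg_key = nn.Linear(hidden_dim, hidden_dim)      # attention key
        self.msg_value = nn.Linear(hidden_dim, hidden_dim)    # attention value
        self.gru = nn.GRUCell(2 * hidden_dim, hidden_dim)     # recurrent unit over time steps
        self.policy_head = nn.Linear(hidden_dim, action_dim)  # logits of a stochastic policy

    def forward(self, obs, adj, h_prev):
        # obs:    (N, obs_dim)  partial observations of N UAVs
        # adj:    (N, N)        1 where two UAVs can communicate; assumed to include self-loops
        # h_prev: (N, hidden)   recurrent hidden state from the previous time step
        e = torch.relu(self.encode(obs))
        q, k, v = self.msg_query(e), self.msg_key(e), self.msg_value(e)
        scores = q @ k.t() / k.shape[-1] ** 0.5
        scores = scores.masked_fill(adj == 0, float("-inf"))  # restrict attention to in-range neighbors
        msg = torch.softmax(scores, dim=-1) @ v                # graph-convolution-style aggregation
        h = self.gru(torch.cat([e, msg], dim=-1), h_prev)      # fuse spatial and historical features
        logits = self.policy_head(h)
        return torch.distributions.Categorical(logits=logits), h

In a maximum-entropy setup such as SDRGN, the policy produced by such a module would be trained with an entropy-regularized objective (for example, an actor loss of the form -(advantage * log_prob + alpha * entropy)), which is what keeps the learned policy stochastic; the paper's exact objective and heuristic reward are not reproduced here.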