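Reinforcement learning
Computer science
Efficient energy use
NOMA
Mathematical optimization
Convergence (economics)
Deep learning
Trajectory
Optimization problem
Latency (audio)
Artificial intelligence
Algorithm
Telecommunications
Mathematics
Telecommunications link
Engineering
Physics
Astronomy
Economic growth
Electrical engineering
Economics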
Authors
Zhaoyi Feng,Zhichao Sheng,Ye Shi,Ali A. Nasir,Yong Fang
Identifier
DOI:10.1109/lcomm.2023.3305537
Abstract
Non-orthogonal multiple access (NOMA) is regarded as a promising solution for improving the energy efficiency and reducing the latency of unmanned aerial vehicle (UAV)-aided networks. In this letter, we consider an energy-efficient multi-UAV data collection system incorporating hybrid NOMA. Specifically, the joint trajectory design and power allocation problem is formulated to maximize the energy utilization of the system. The optimization problem is a mixed-integer non-convex problem that also involves continuous variables. To tackle this challenging problem, we utilize a multi-agent deep reinforcement learning (MADRL) approach, namely multi-agent Twin Delayed Deep Deterministic Policy Gradient (MATD3), which introduces clipped double Q-learning and deep networks to reduce overestimation bias. Furthermore, a reward shaping method is applied to improve learning efficiency and speed up convergence. Extensive experiments corroborate that the proposed hybrid NOMA-enhanced multi-UAV system outperforms the pure NOMA and OMA cases.
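The clipped double Q-learning step named in the abstract can be illustrated with a short sketch. This shows only the standard TD3-style target computation and is not the authors' MATD3 implementation; the function name td3_target, its argument names, and the default discount factor are assumptions introduced here for illustration.

```python
import numpy as np


def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Bootstrapped target using clipped double Q-learning.

    Hypothetical helper (not from the paper). q1_next and q2_next are the
    two target critics' value estimates for the next state-action pair.
    """
    reward = np.asarray(reward, dtype=float)
    done = np.asarray(done, dtype=float)
    # Take the element-wise minimum of the two critics: the pessimistic
    # estimate counteracts the overestimation bias of a single critic.
    q_min = np.minimum(np.asarray(q1_next, dtype=float),
                       np.asarray(q2_next, dtype=float))
    # Terminal transitions (done == 1) do not bootstrap from the next state.
    return reward + gamma * (1.0 - done) * q_min


if __name__ == "__main__":
    targets = td3_target(
        reward=[1.0, 2.0],
        done=[0, 1],
        q1_next=[10.0, 5.0],
        q2_next=[8.0, 7.0],
        gamma=0.5,
    )
    print(targets)
```

In a multi-agent setting each agent would typically compute such a target from its own pair of critics; how the critics are shared or centralized across UAVs is not specified in the abstract and is left open here.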