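Reinforcement learning
Computer science
Sample (material)
Mathematical optimization
Scale (ratio)
Control (management)
Optimization problem
Artificial intelligence
Adversarial system
Function (biology)
Missile
Engineering
Algorithm
Mathematics
Chemistry
Physics
Chromatography
Quantum mechanics
Evolutionary biology
Biology
Aerospace engineering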
Authors
Weilin Luo, Jinhu Lü, Kexin Liu, Lei Chen
Identifier
DOI:10.1109/tsmc.2021.3096997
Abstract
The missile-target assignment (MTA) is a typical weapon-target assignment problem in Command and Control of modern warfare. Despite the significance of the problem, traditional algorithms still lack efficiency, solution quality, and practicability in the adversarial environment. In this article, we propose a data-driven policy optimization with deep reinforcement learning (PODRL) for the adversarial MTA. We design a comprehensive reward function to motivate the optimization of assignment policy. As such, the learned policy can implicitly model the penetration of missiles under an adversarial environment in a data-driven way. We also present a fair sample strategy to improve the sample efficiency and accelerate the policy optimization. Experimental results show that PODRL can adaptively generate satisfactory solutions in both small-scale and large-scale instances. Furthermore, we evaluate the effectiveness of PODRL in a multiobjective scenario. The result demonstrates that a well-optimized policy can achieve high-quality allocation and demand forecast of the missile resources simultaneously.
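For readers unfamiliar with the problem, the following is a minimal sketch of the classic static weapon-target assignment objective, which an assignment policy could be scored against. It is not the reward function from the paper, whose details the abstract does not give. The names `kill_prob`, `target_value`, `reward`, and `brute_force_best` are hypothetical, and the kill probabilities and target values are made-up illustrative numbers.

```python
import itertools

# Hypothetical toy instance: 2 missiles, 2 targets.
# KILL_PROB[i][j] is an assumed probability that missile i destroys target j.
KILL_PROB = [[0.5, 0.5],
             [0.5, 0.5]]
# TARGET_VALUE[j] is an assumed value (threat level) of target j.
TARGET_VALUE = [10.0, 4.0]


def expected_surviving_value(assignment, kill_prob, target_value):
    """Expected total value of the targets that survive the attack.

    assignment[i] is the index of the target that missile i is fired at.
    Missiles are assumed to act independently of each other.
    """
    survive = [1.0] * len(target_value)
    for missile, target in enumerate(assignment):
        survive[target] *= 1.0 - kill_prob[missile][target]
    return sum(v * s for v, s in zip(target_value, survive))


def reward(assignment, kill_prob, target_value):
    """Illustrative reward: expected destroyed value (higher is better)."""
    return sum(target_value) - expected_surviving_value(
        assignment, kill_prob, target_value
    )


def brute_force_best(kill_prob, target_value):
    """Enumerate every assignment; only feasible for tiny instances."""
    n_missiles = len(kill_prob)
    n_targets = len(target_value)
    candidates = itertools.product(range(n_targets), repeat=n_missiles)
    return max(candidates, key=lambda a: reward(a, kill_prob, target_value))


best = brute_force_best(KILL_PROB, TARGET_VALUE)
```

Exhaustive search grows as (number of targets)^(number of missiles), which is one reason a learned policy is attractive for large-scale instances.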