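Jamming
Radar
Reinforcement learning
Markov decision process
Frequency-hopping spread spectrum
Computer science
Interception
Electronic warfare
Adaptation (eye)
Interval (graph theory)
Markov process
Mathematical optimization
Electronic engineering
Engineering
Artificial intelligence
Telecommunications
Mathematics
Statistics
Physics
Optics
Combinatorics
Thermodynamics
Biology
Ecology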
Authors
Ailiya, Wei Yi, Pramod K. Varshney
Source
Journal: IEEE Transactions on Vehicular Technology
[Institute of Electrical and Electronics Engineers]
Date: 2022-08-09
Volume (Issue): Pages: 71 (12): 12434-12449
Citations: 34
Identifier
DOI: 10.1109/tvt.2022.3197425
Abstract
In modern electronic warfare, it is becoming very important to develop intelligent and adaptive radar anti-jamming methods since jammers can now launch increasingly complex and unpredictable attacks. Besides, in practice, the jamming strategy is usually unknown to the radar. To overcome the limitations caused by the lack of information about the jammer, reinforcement learning is applied to radar anti-jamming in this paper via the adaptation of frequency hopping interval. In reinforcement learning, the sequential decision problem to solve is described as a Markov Decision Process (MDP). To describe the sequential radar anti-jamming decision making process, a detailed radar anti-jamming MDP model is formulated. To balance between integration efficiency and probability of interception, a flexible adjustable tradeoff between them is devised by defining the reward function of the MDP as the weighted sum of the integration efficiency factor and the probability of interception factor. Two properties of the MDP value function are proved. These properties are used to derive the optimal frequency hopping time interval for different pulse widths under the RL framework. Simulation results show that the proposed radar anti-jamming strategy can adapt to the jamming environment well and can control its performance flexibly by adjusting the weights of integration efficiency and probability of interception.
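The abstract defines the MDP reward as a weighted sum of an integration efficiency factor and a probability of interception factor. The following is a minimal sketch of that idea only, not the paper's formulation: the function name `weighted_reward`, the single weight in [0, 1] (so that the two weights sum to 1), and the assumption that both factors arrive already computed and scaled so that larger is better are all hypothetical choices made for illustration.

```python
def weighted_reward(integration_factor: float,
                    interception_factor: float,
                    weight: float) -> float:
    """Combine two precomputed factors into one scalar reward.

    Assumptions (not taken from the paper): both factors are scaled so
    that a larger value is better for the radar, and the two weights are
    `weight` and `1 - weight`.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    # Weighted sum: `weight` emphasizes integration efficiency,
    # the remainder emphasizes the interception-related factor.
    return weight * integration_factor + (1.0 - weight) * interception_factor


if __name__ == "__main__":
    # Sweep the weight to see how the tradeoff shifts the reward.
    for w in (0.0, 0.5, 1.0):
        print(w, weighted_reward(0.8, 0.4, w))
```

Under these assumptions, moving `weight` toward 1 favors integration efficiency and moving it toward 0 favors the interception-related factor, which mirrors the adjustable tradeoff the abstract describes.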