Deep Reinforcement Learning-Based Multichannel Access for Industrial Wireless Networks With Dynamic Multiuser Priority

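Reinforcement learning, Computer science, Markov decision process, Wireless, Wireless network, Convergence (economics), Markov process, Distributed computing, Machine learning, Telecommunications, Mathematics, Economic growth, Statistics, Economics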

Authors

Xiaoyu Liu, Chi Xu, Haibin Yu, Peng Zeng

Source

Journal: IEEE Transactions on Industrial Informatics [Institute of Electrical and Electronics Engineers]
Date: 2021-12-31 Volume (Issue): 18 (10): 7048-7058 Citations: 21

Identifier

DOI: 10.1109/tii.2021.3139349

Abstract

In Industry 4.0, massive heterogeneous industrial devices generate a great deal of data with different quality of service requirements, and communicate via industrial wireless networks (IWNs). However, the limited time-frequency resources of IWNs cannot well support the high concurrent access of massive industrial devices with strict real-time and reliable communication requirements. To address this problem, a deep reinforcement learning-based dynamic priority multichannel access (DRL-DPMCA) algorithm is proposed in this article. Firstly, according to the time-sensitivity of industrial data, industrial devices are assigned with different priorities, based on which their channel access probabilities are dynamically adjusted. Then, the Markov decision process is utilized to model the dynamic priority multichannel access problem. To cope with the explosion of state space caused by the multichannel access of massive industrial devices with dynamic priorities, DRL is used to establish the mapping from states to actions. Next, the long-term cumulative reward is maximized to obtain an effective policy. Especially, with joint consideration of the access reward and priority reward, a compound reward for multichannel access and dynamic priority is designed. For breaking the time correlation of training data while accelerating the convergence of DRL-DPMCA, an experience replay with experience-weight is proposed to store and sample experiences categorically. Besides, the gated recurrent unit, dueling architecture and step-by-step $\varepsilon$-greedy method are employed to make states more comprehensive and reduce model oscillation. Extensive experiments show that, compared with slotted-Aloha and deep Q network algorithms, DRL-DPMCA converges quickly, and guarantees the highest channel access probability and the minimum queuing delay for high-priority industrial devices in the context of minimum access conflict and nearly 100% channel utilization.
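
The abstract names several building blocks without giving their exact form, so the following is only a rough sketch of how three of them might look in isolation. The dueling aggregation shown is the commonly used mean-subtracted form; the step-wise epsilon schedule, the category names, the category weights, and every function and class name here are hypothetical and are not taken from the paper.

```python
import random
from collections import deque


def dueling_q_values(state_value, advantages):
    """Combine a scalar state value V(s) with per-action advantages A(s, a)
    using the common dueling aggregation Q = V + (A - mean(A))."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


def stepwise_epsilon(step, eps_start=1.0, eps_end=0.05, drop=0.05, interval=100):
    """Assumed step-by-step schedule: epsilon falls by `drop` every
    `interval` training steps and never goes below `eps_end`."""
    return max(eps_end, eps_start - drop * (step // interval))


def select_channel(q_values, epsilon, rng=random):
    """Epsilon-greedy choice over actions (one action per channel option)."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore: uniform random action
    return max(range(len(q_values)), key=q_values.__getitem__)  # exploit


class CategorizedReplay:
    """Assumed form of categorized replay: experiences are stored per
    category, and categories are drawn in proportion to given weights."""

    def __init__(self, weights, capacity=1000):
        self.weights = dict(weights)
        self.buffers = {name: deque(maxlen=capacity) for name in self.weights}

    def store(self, category, experience):
        self.buffers[category].append(experience)

    def sample(self, batch_size, rng=random):
        # Only non-empty categories are eligible for sampling.
        names = [n for n in self.buffers if self.buffers[n]]
        if not names:
            return []
        picks = rng.choices(names, weights=[self.weights[n] for n in names], k=batch_size)
        return [rng.choice(self.buffers[n]) for n in picks]
```

Sampling categories with replacement, rather than replaying experiences in arrival order, is one simple way to break the time correlation the abstract mentions; how the paper actually assigns experience weights is not stated here.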
