Deep-Reinforcement Learning Multiple Access for Heterogeneous Wireless Networks

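Computer science; Reinforcement learning; ALOHA; Throughput; Time-division multiple access; Computer network; Wireless network; Wireless; Distributed computing; Artificial intelligence; Telecommunications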

Authors

Yiding Yu, Taotao Wang, Soung Chang Liew

Source

Journal: IEEE Journal on Selected Areas in Communications [Institute of Electrical and Electronics Engineers]
Date: 2019-03-12 Volume (Issue): 37 (6): 1277-1290 Citations: 302

Links

arxiv.org, doi.org

Identifier

DOI: 10.1109/jsac.2019.2904329

Abstract

This paper investigates a deep reinforcement learning (DRL)-based MAC protocol for heterogeneous wireless networking, referred to as Deep-reinforcement Learning Multiple Access (DLMA). Specifically, we consider a scenario in which a number of networks operating different MAC protocols try to access the time slots of a common wireless medium. A key challenge in our problem formulation is that we assume our DLMA network does not know the operating principles of the MACs of the other networks; i.e., DLMA does not know how the other MACs decide when to transmit and when not to. The goal of DLMA is to learn an optimal channel access strategy that achieves a pre-specified global objective. Possible objectives include maximizing the sum throughput and maximizing α-fairness among all networks. The underpinning learning process of DLMA is based on DRL. With proper definitions of the state space, action space, and rewards in DRL, we show that DLMA can easily maximize the sum throughput by judiciously selecting certain time slots in which to transmit. Maximizing general α-fairness, however, is beyond the means of the conventional reinforcement learning (RL) framework. We put forth a new multi-dimensional RL framework that enables DLMA to maximize general α-fairness. Our extensive simulation results show that DLMA can maximize sum throughput or achieve proportional fairness (two special classes of α-fairness) when coexisting with TDMA and ALOHA MAC protocols, without knowing that they are TDMA or ALOHA. Importantly, we show the merit of incorporating neural networks into the RL framework (i.e., why DRL and not just traditional RL): specifically, the use of DRL allows DLMA (i) to learn the optimal strategy much faster and (ii) to be more robust, in that it can still learn a near-optimal strategy even when the parameters in the RL framework are not optimally set.
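
The relationship between sum throughput and proportional fairness as special cases of α-fairness can be illustrated with a short sketch. It assumes the standard α-fair utility from the network-utility literature, x^(1-α)/(1-α) for α ≠ 1 and log(x) for α = 1; the abstract does not state the paper's exact formulation, so this form, the function names, and the example throughput values are all illustrative assumptions rather than the authors' code.

```python
import math
from typing import Sequence


def alpha_fair_utility(throughput: float, alpha: float) -> float:
    """Standard alpha-fair utility of one network's throughput (assumed form)."""
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    if alpha == 1:
        # The alpha -> 1 case of the family is the log utility,
        # which corresponds to proportional fairness.
        return math.log(throughput)
    return throughput ** (1 - alpha) / (1 - alpha)


def alpha_fair_objective(throughputs: Sequence[float], alpha: float) -> float:
    """Global objective: sum of per-network alpha-fair utilities."""
    return sum(alpha_fair_utility(x, alpha) for x in throughputs)


# Hypothetical throughput splits between two coexisting networks.
skewed = [0.9, 0.1]
equal = [0.5, 0.5]

# alpha = 0 reduces to plain sum throughput, so the two splits score the same.
sum_skewed = alpha_fair_objective(skewed, alpha=0)
sum_equal = alpha_fair_objective(equal, alpha=0)

# alpha = 1 (proportional fairness) scores the equal split higher.
pf_skewed = alpha_fair_objective(skewed, alpha=1)
pf_equal = alpha_fair_objective(equal, alpha=1)
```

With α = 0 the objective cannot tell the two splits apart, while with α = 1 the equal split is preferred, which is why the choice of α changes which channel access strategy counts as optimal.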
