Throughput Optimization for Grant-Free Multiple Access With Multiagent Deep Reinforcement Learning

Cited by: 16

Authors:

Huang, Rui ^{[1]}

Wong, Vincent W. S. ^{[1]}

Schober, Robert ^{[2]}

Affiliations:

[1] Univ British Columbia, Dept Elect & Comp Engn, Vancouver, BC V6T 1Z4, Canada

[2] Friedrich Alexander Univ Erlangen Nuremberg, Inst Digital Commun, D-91058 Erlangen, Germany

Source:

IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS | 2021 / Vol. 20 / No. 01

Funding:

Natural Sciences and Engineering Research Council of Canada;

Keywords:

Grant-free multiple access (GFMA); deep reinforcement learning (DRL); medium access control (MAC) protocols; Internet of Things (IoT);

DOI:

10.1109/TWC.2020.3024166

Chinese Library Classification (CLC) number:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

Grant-free multiple access (GFMA) is a promising paradigm to efficiently support uplink access of Internet of Things (IoT) devices. In this paper, we propose a deep reinforcement learning (DRL)-based pilot sequence selection scheme for GFMA systems to mitigate potential pilot sequence collisions. We formulate a pilot sequence selection problem for aggregate throughput maximization in GFMA systems with specific throughput constraints as a Markov decision process (MDP). By exploiting multiagent DRL, we train deep neural networks (DNNs) to learn near-optimal pilot sequence selection policies from the transition history of the underlying MDP without requiring information exchange between the users. While the training process takes advantage of global information, we leverage the technique of factorization to ensure that the policies learned by the DNNs can be executed in a distributed manner. Simulation results show that the proposed scheme can achieve an average aggregate throughput that is within 85% of the optimum, and is 31%, 128%, and 162% higher than that of acknowledgement-based GFMA, dynamic access class barring, and random selection GFMA, respectively. Our results also demonstrate the capability of the proposed scheme to support IoT devices with specific throughput requirements.
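To make the pilot sequence collision problem concrete, the following minimal sketch simulates the random selection baseline named in the abstract. It uses a deliberately simplified collision model that is an assumption of this sketch and not the paper's system model: each user picks one pilot sequence uniformly at random per time slot, and a transmission counts as successful only if no other user picked the same pilot. The function names and parameters are hypothetical and chosen for illustration only.

```python
import random


def random_selection_successes(num_users, num_pilots, rng):
    """Simulate one time slot of random pilot selection.

    Assumed collision model: a user succeeds only if its pilot
    was chosen by no other user in the same slot.
    """
    choices = [rng.randrange(num_pilots) for _ in range(num_users)]
    counts = [0] * num_pilots
    for pilot in choices:
        counts[pilot] += 1
    # Count users whose pilot is held by exactly one user (themselves).
    return sum(1 for pilot in choices if counts[pilot] == 1)


def expected_successes(num_users, num_pilots):
    """Closed form under the same assumption.

    A given user succeeds when each of the other (num_users - 1) users
    independently avoids its pilot, with probability (1 - 1/num_pilots).
    """
    return num_users * (1 - 1 / num_pilots) ** (num_users - 1)


def simulate(num_users, num_pilots, num_slots=20000, seed=0):
    """Average number of collision-free users per slot over many slots."""
    rng = random.Random(seed)
    total = sum(
        random_selection_successes(num_users, num_pilots, rng)
        for _ in range(num_slots)
    )
    return total / num_slots


if __name__ == "__main__":
    users, pilots = 6, 4
    print("simulated:", simulate(users, pilots))
    print("closed form:", expected_successes(users, pilots))
```

Under this assumed model, the average number of collision-free users falls quickly once users outnumber pilots, which is the inefficiency a learned, coordinated selection policy aims to reduce.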


Pages: 228-242

Number of pages: 15

