Transmit Power Pool Design for Grant-Free NOMA-IoT Networks via Deep Reinforcement Learning

Cited by: 43
Authors
Fayaz, Muhammad [1 ,2 ]
Yi, Wenqiang [3 ]
Liu, Yuanwei [3 ]
Nallanathan, Arumugam [3 ]
Affiliations
[1] Queen Mary Univ London, Sch Elect Engn & Comp Sci EECS, London E1 4NS, England
[2] Univ Malakand, Dept Comp Sci & Informat Technol, Chakdara 18800, Pakistan
[3] Queen Mary Univ London, London E1 4NS, England
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
NOMA; Resource management; Power markets; Prototypes; Optimization; Throughput; Wireless communication; Double Q learning; grant-free NOMA; Internet of Things; multi-agent deep reinforcement learning; resource allocation; NONORTHOGONAL MULTIPLE-ACCESS; DYNAMIC SPECTRUM ACCESS; RESOURCE-ALLOCATION;
DOI
10.1109/TWC.2021.3086762
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline Classification Codes
0808; 0809;
Abstract
Grant-free non-orthogonal multiple access (GF-NOMA) is a promising multiple-access framework for short-packet Internet-of-Things (IoT) networks to enhance connectivity. However, resource allocation in GF-NOMA is challenging due to the absence of closed-loop power control. We design a prototype transmit power pool (PP) to provide open-loop power control: IoT users acquire their transmit power in advance from this prototype PP solely according to their communication distances. First, a multi-agent deep Q-network (DQN) aided GF-NOMA algorithm is proposed to determine the optimal transmit power levels for the prototype PP. More specifically, each IoT user acts as an agent and, by interacting with the wireless environment, learns a policy that guides it to select optimal actions. Second, to mitigate the overestimation problem of Q-learning, a double DQN (DDQN)-based GF-NOMA algorithm is proposed. Numerical results confirm that the DDQN-based algorithm identifies the optimal transmit power levels that form the PP. Compared with the conventional online learning approach, the proposed algorithm with the prototype PP converges faster under changing environments, since the action space is limited based on previous learning. In terms of throughput, the considered GF-NOMA system outperforms both networks with fixed transmit power (i.e., all users transmitting at the same power) and traditional grant-free orthogonal multiple access schemes.
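The "overestimation problem" that motivates the DDQN variant in the abstract can be sketched as follows. This is a minimal illustration of the generic DQN vs. double-DQN target computation, not code from the paper; the array shapes and function names are illustrative assumptions.

```python
import numpy as np

def dqn_target(q_target_next, rewards, gamma):
    # Standard DQN target: the target network both selects and
    # evaluates the next action (max over the same Q-estimates),
    # which biases the target upward when estimates are noisy.
    return rewards + gamma * q_target_next.max(axis=1)

def ddqn_target(q_online_next, q_target_next, rewards, gamma):
    # Double DQN target: the online network selects the greedy
    # next action, the target network evaluates it. Decoupling
    # selection from evaluation reduces overestimation.
    greedy = q_online_next.argmax(axis=1)
    return rewards + gamma * q_target_next[np.arange(len(greedy)), greedy]

# Toy batch of one transition: the two networks disagree on the
# best next action, so the DDQN target is the more conservative one.
rewards = np.array([0.0])
q_online_next = np.array([[1.0, 0.0]])  # online net prefers action 0
q_target_next = np.array([[0.0, 2.0]])  # target net overvalues action 1
print(dqn_target(q_target_next, rewards, 0.9))                  # 1.8
print(ddqn_target(q_online_next, q_target_next, rewards, 0.9))  # 0.0
```

In the paper's setting each IoT-user agent would apply such a target when updating its Q-network over candidate transmit power levels, but the state/action encoding is specific to the paper and not reproduced here.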
Pages: 7626-7641
Number of pages: 16
Cited References
39 records in total
[1]   A Novel Analytical Framework for Massive Grant-Free NOMA [J].
Abbas, Rana ;
Shirvanimoghaddam, Mahyar ;
Li, Yonghui ;
Vucetic, Branka .
IEEE TRANSACTIONS ON COMMUNICATIONS, 2019, 67 (03) :2436-2449
[2]  
Ahsan W., 2020, arXiv:2007.08350
[3]  
[Anonymous], 2016, 3GPP TSG RAN WG1 M L
[4]  
[Anonymous], 2016, 3GPP TSG RAN WG1 M Z
[5]   Distributive Dynamic Spectrum Access Through Deep Reinforcement Learning: A Reservoir Computing-Based Approach [J].
Chang, Hao-Hsuan ;
Song, Hao ;
Yi, Yang ;
Zhang, Jianzhong ;
He, Haibo ;
Liu, Lingjia .
IEEE INTERNET OF THINGS JOURNAL, 2019, 6 (02) :1938-1948
[6]   Massive Access for 5G and Beyond [J].
Chen, Xiaoming ;
Ng, Derrick Wing Kwan ;
Yu, Wei ;
Larsson, Erik G. ;
Al-Dhahir, Naofal ;
Schober, Robert .
IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, 2021, 39 (03) :615-637
[8]   Multi-Agent Reinforcement Learning-Based Resource Allocation for UAV Networks [J].
Cui, Jingjing ;
Liu, Yuanwei ;
Nallanathan, Arumugam .
IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, 2020, 19 (02) :729-743
[9]   Transmit Power Pool Design for Uplink IoT Networks with Grant-free NOMA [J].
Fayaz, Muhammad ;
Yi, Wenqiang ;
Liu, Yuanwei ;
Nallanathan, Arumugam .
IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC 2021), 2021,
[10]  
Goodfellow I, 2016, ADAPT COMPUT MACH LE, P1