Multi-Agent Reinforcement Learning Based Energy Efficiency Optimization in NB-IoT Networks

Cited by: 8
Authors
Guo, Yuancheng [1]
Xiang, Min [1]
Affiliation
[1] Imperial Coll London, Dept Elect & Elect Engn, London, England
Source
2019 IEEE GLOBECOM WORKSHOPS (GC WKSHPS) | 2019
Keywords
NB-IoT; MARL; WoLF-PHC; power ramping; preamble allocation; energy efficiency;
DOI
10.1109/gcwkshps45667.2019.9024676
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Built on the existing Evolved Packet System (EPS) architecture, Narrowband Internet of Things (NB-IoT) is expected to be a promising paradigm for supporting energy-aware massive Machine Type Communications (mMTC). However, given the tremendous increase in the number of IoT devices and their energy-saving and low-cost requirements, the current power ramping and preamble allocation mechanisms in legacy Long Term Evolution (LTE) can hardly achieve high energy efficiency in machine-to-machine (M2M) communications, mainly because of the significant redundancy of control signals. Owing to the strict restrictions of NB-IoT, the standardized preamble allocation mechanism is still random selection. To satisfy these constraints, this work proposes a joint optimization framework of power ramping and preamble picking to improve the energy efficiency of NB-IoT systems. For this optimization problem, a comprehensive energy estimation model is established, which investigates the inadequacy of the random access (RA) procedure and reveals the effects of power ramping and preamble picking on energy efficiency. In addition, to search for the optimal policies of the formulated joint optimization, a distributed Multi-Agent Reinforcement Learning (MARL) algorithm based on Win-or-Learn-Fast Policy Hill-Climbing (WoLF-PHC) is proposed, in which a "stateless" modification is introduced to reduce the algorithm complexity significantly. Simulations validate the high energy efficiency achieved and also demonstrate the applicability and convergence of the designed WoLF-PHC based optimization algorithm.
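To make the "stateless" WoLF-PHC idea concrete, the following is a minimal sketch of a single agent, not the authors' implementation. It assumes the textbook WoLF-PHC update with the state dimension dropped, so the Q-table and policy are indexed by action only. The class name, the hyperparameter values, the epsilon exploration term, and the idea that one action could stand for a (preamble, power ramping step) pair are all illustrative assumptions.

```python
import random


class StatelessWoLFPHC:
    """Hypothetical single-agent, stateless WoLF-PHC learner (illustrative only)."""

    def __init__(self, n_actions, alpha=0.1, delta_win=0.01,
                 delta_lose=0.04, epsilon=0.05, seed=None):
        self.n = n_actions
        self.alpha = alpha            # Q-value learning rate (assumed value)
        self.delta_win = delta_win    # small policy step when "winning"
        self.delta_lose = delta_lose  # larger policy step when "losing"
        self.epsilon = epsilon        # extra uniform exploration (assumption)
        self.q = [0.0] * n_actions
        self.pi = [1.0 / n_actions] * n_actions      # current mixed policy
        self.pi_avg = [1.0 / n_actions] * n_actions  # running average policy
        self.count = 0
        self.rng = random.Random(seed)

    def act(self):
        # Sample from the mixed policy, with occasional uniform exploration.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n)
        return self.rng.choices(range(self.n), weights=self.pi)[0]

    def update(self, action, reward):
        # Stateless Q update: no next-state bootstrap term.
        self.q[action] += self.alpha * (reward - self.q[action])

        # Update the running average of the policy.
        self.count += 1
        for a in range(self.n):
            self.pi_avg[a] += (self.pi[a] - self.pi_avg[a]) / self.count

        # "Win or learn fast": use the small step if the current policy has a
        # higher expected value than the average policy, else the large step.
        v_cur = sum(p * q for p, q in zip(self.pi, self.q))
        v_avg = sum(p * q for p, q in zip(self.pi_avg, self.q))
        delta = self.delta_win if v_cur > v_avg else self.delta_lose

        # Hill-climb: shift probability mass toward the greedy action.
        best = max(range(self.n), key=self.q.__getitem__)
        moved = 0.0
        for a in range(self.n):
            if a != best:
                step = min(self.pi[a], delta / (self.n - 1))
                self.pi[a] -= step
                moved += step
        self.pi[best] += moved


if __name__ == "__main__":
    # Toy environment (assumption): action 2 is the only rewarded action.
    agent = StatelessWoLFPHC(n_actions=3, seed=1)
    for _ in range(2000):
        a = agent.act()
        agent.update(a, 1.0 if a == 2 else 0.0)
    print([round(p, 3) for p in agent.pi])
```

In a multi-agent setting, each device would run its own instance and learn from its own reward signal only, which is what makes the scheme distributed. Dropping the state keeps the per-agent memory at three vectors of length equal to the number of actions.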
Pages: 6