Neighbor Discovery and Selection in Millimeter Wave D2D Networks Using Stochastic MAB

Cited by: 37
Authors
Hashima, Sherief [1, 2]
Hatano, Kohei [3, 4]
Takimoto, Eiji [5]
Mohamed, Ehab Mahmoud [6, 7]
Affiliations
[1] RIKEN, Computat Learning Theory Team, Adv Intelligent Project AIP, Fukuoka 8190395, Japan
[2] Egyptian Atom Energy Author, Engn & Sci Equipments Dept, Cairo 31759, Egypt
[3] Kyushu Univ, Fac Arts & Sci, Fukuoka 8190395, Japan
[4] RIKEN, AIP, Chuo City, Tokyo, Japan
[5] Kyushu Univ, Dept Informat, Fukuoka 8190395, Japan
[6] Prince Sattam Bin Abdulaziz Univ, Coll Engn, Wadi Addwasir 11991, Saudi Arabia
[7] Aswan Univ, Fac Engn, Aswan 81542, Egypt
Funding
Japan Society for the Promotion of Science (JSPS)
Keywords
Device-to-device communication; Throughput; 5G mobile communication; Performance evaluation; Network architecture; Training; Shadow mapping; mmWave; device-to-device (D2D); multi-armed bandit (MAB); neighbor discovery & selection (NDS)
DOI
10.1109/LCOMM.2020.2991535
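Chinese Library Classification (CLC) number
TN [Electronic technology, communication technology]
Discipline classification code
0809
Abstract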
The propagation characteristics of millimeter waves (mmWaves) encourage their use in device-to-device (D2D) communications for fifth-generation (5G) and beyond-5G (B5G) networks. However, because beamforming training (BT) is required, there is a tradeoff between exploring neighbor devices to select the best one and the overhead that this exploration incurs. In this letter, joint neighbor discovery and selection (NDS) in mmWave D2D networks is formulated, using machine learning tools, as a stochastic budget-constrained multi-armed bandit (MAB) problem. A modified Thompson sampling (TS) algorithm and variants of upper confidence bound (UCB) based algorithms are then proposed to solve it while accounting for the residual energies of the surrounding devices. Simulations show that the proposed techniques outperform conventional approaches in average throughput, energy efficiency, and network lifetime.
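To make the bandit formulation in the abstract concrete, the following is a minimal sketch of the standard UCB1 rule applied to neighbor selection. It is not the letter's algorithm: the abstract does not give the details of the modified TS or UCB variants, the budget constraint, or the residual-energy weighting, so none of these are reproduced here. The function names, the Bernoulli reward model (a stand-in for a normalized link-quality measurement), and the example success rates are all assumptions made for illustration.

```python
import math
import random


def ucb1_select(counts, means, t):
    """Return the index of the neighbor (arm) with the highest UCB1 score."""
    # Try every neighbor once before relying on the confidence bound.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    # Empirical mean plus an exploration bonus that shrinks with more pulls.
    scores = [m + math.sqrt(2.0 * math.log(t) / n) for m, n in zip(means, counts)]
    return max(range(len(scores)), key=scores.__getitem__)


def run_ucb1(true_rates, rounds, seed=0):
    """Simulate UCB1 over hypothetical neighbors with Bernoulli rewards.

    true_rates: assumed per-neighbor success probabilities (illustrative only).
    Returns (counts, means): pulls per neighbor and empirical mean rewards.
    """
    rng = random.Random(seed)
    k = len(true_rates)
    counts = [0] * k
    means = [0.0] * k
    for t in range(1, rounds + 1):
        arm = ucb1_select(counts, means, t)
        # Assumed reward model: 1.0 on a successful link, 0.0 otherwise.
        reward = 1.0 if rng.random() < true_rates[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the empirical mean for the chosen neighbor.
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means


if __name__ == "__main__":
    counts, means = run_ucb1([0.2, 0.5, 0.9], rounds=5000, seed=1)
    print("pulls per neighbor:", counts)
    print("estimated mean rewards:", [round(m, 3) for m in means])
```

In this sketch each pull stands for one beamforming-training attempt with a candidate neighbor, so the exploration bonus is what trades discovery overhead against exploiting the best neighbor found so far. An energy-aware variant along the lines the abstract describes would additionally factor each device's residual energy into the selection, which is left out here because the source does not specify how.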
Pages: 1840-1844
Page count: 5