Neighbor Discovery and Selection in Millimeter Wave D2D Networks Using Stochastic MAB

Cited by: 37

Authors:

Hashima, Sherief [1,2]

Hatano, Kohei [3,4]

Takimoto, Eiji [5]

Mohamed, Ehab Mahmoud [6,7]

Affiliations:

[1] RIKEN, Computat Learning Theory Team, Adv Intelligent Project AIP, Fukuoka 8190395, Japan

[2] Egyptian Atom Energy Author, Engn & Sci Equipments Dept, Cairo 31759, Egypt

[3] Kyushu Univ, Fac Arts & Sci, Fukuoka 8190395, Japan

[4] RIKEN, AIP, Chuo City, Tokyo, Japan

[5] Kyushu Univ, Dept Informat, Fukuoka 8190395, Japan

[6] Prince Sattam Bin Abdulaziz Univ, Coll Engn, Wadi Addwasir 11991, Saudi Arabia

[7] Aswan Univ, Fac Engn, Aswan 81542, Egypt

Source:

IEEE COMMUNICATIONS LETTERS | 2020 / Vol. 24 / No. 8

Funding:

Japan Society for the Promotion of Science (JSPS)

Keywords:

Device-to-device communication; Throughput; 5G mobile communication; Performance evaluation; Network architecture; Training; Shadow mapping; mmWave; device-to-device (D2D); multi-armed bandit (MAB); neighbor discovery & selection (NDS)

DOI:

10.1109/LCOMM.2020.2991535

Chinese Library Classification (CLC) number:

TN [Electronic technology, communication technology]

Discipline classification code:

0809

Abstract:

The propagation characteristics of millimeter waves (mmWaves) encourage their use in device-to-device (D2D) communications for fifth-generation (5G) and future beyond-5G (B5G) networks. However, because beamforming training (BT) is required, there is a tradeoff between exploring neighbor devices to select the best one and the overhead this exploration incurs. In this letter, joint neighbor discovery and selection (NDS) in mmWave D2D networks is formulated, using machine learning tools, as a stochastic budget-constrained multi-armed bandit (MAB) problem. Modified Thompson sampling (TS) and variants of upper confidence bound (UCB) based algorithms are then proposed to solve it while accounting for the residual energies of the surrounding devices. Simulation analysis demonstrates the effectiveness of the proposed techniques over conventional approaches in terms of average throughput, energy efficiency, and network lifetime.
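
To make the bandit formulation in the abstract more concrete, the following Python sketch shows one plausible reading of it: each neighbor device is an arm, the reward is a normalized throughput sample, and each device can only be selected while it has energy left. This is not the letter's algorithm. It is plain UCB1 restricted to devices with remaining budget, and the function name `ucb_neighbor_selection`, the Bernoulli reward model, and the per-device selection budgets are all assumptions made for illustration.

```python
import math
import random


def ucb_neighbor_selection(mean_rewards, energy_budgets, rounds, seed=0):
    """UCB1 restricted to devices that still have energy left (illustrative only).

    mean_rewards   -- hypothetical true mean normalized throughput per device, in [0, 1]
    energy_budgets -- hypothetical number of selections each device can serve
    Returns (selection_counts, total_reward).
    """
    rng = random.Random(seed)
    k = len(mean_rewards)
    counts = [0] * k               # times each device was selected
    sums = [0.0] * k               # accumulated reward per device
    remaining = list(energy_budgets)
    total = 0.0

    for t in range(1, rounds + 1):
        alive = [i for i in range(k) if remaining[i] > 0]
        if not alive:
            break                  # every device has exhausted its budget
        untried = [i for i in alive if counts[i] == 0]
        if untried:
            choice = untried[0]    # try each live device once before using the index
        else:
            # UCB1 index: empirical mean plus an exploration bonus.
            choice = max(
                alive,
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        # Assumption: a Bernoulli draw stands in for a normalized throughput sample.
        reward = 1.0 if rng.random() < mean_rewards[choice] else 0.0
        counts[choice] += 1
        sums[choice] += reward
        remaining[choice] -= 1
        total += reward
    return counts, total
```

With generous budgets the selections concentrate on the device with the highest mean reward, while tight budgets force the learner onto weaker devices once the best ones are depleted. That is the exploration versus overhead and energy tension the abstract describes.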


Pages: 1840-1844

Number of pages: 5
