Enhanced Dynamic Spectrum Access in UAV Wireless Networks for Post-Disaster Area Surveillance System: A Multi-Player Multi-Armed Bandit Approach

Cited by: 16
Authors
Amrallah, Amr [1]
Mohamed, Ehab Mahmoud [2, 3]
Tran, Gia Khanh [1]
Sakaguchi, Kei [1]
Affiliations
[1] Tokyo Inst Technol, Sch Engn, Dept Elect & Elect Engn, Meguro Ku, Tokyo 1528550, Japan
[2] Prince Sattam Bin Abdulaziz Univ, Coll Engn, Elect Engn Dept, Wadi Addwasir 11991, Saudi Arabia
[3] Aswan Univ, Fac Engn, Elect Engn Dept, Aswan 81542, Egypt
Keywords
unmanned aerial vehicles; dynamic spectrum access; quality of service; reinforcement learning; multi-armed bandit; cognitive radio networks; management; algorithms; selection
DOI
10.3390/s21237855
Chinese Library Classification number
O65 [Analytical Chemistry]
Subject classification codes
070302; 081704
Abstract
Modern wireless networks are notorious for being very dense, uncoordinated, and selfish, especially with greedy user needs. This leads to a critical scarcity problem in spectrum resources. The Dynamic Spectrum Access (DSA) system is considered a promising solution for this scarcity problem. With the aid of Unmanned Aerial Vehicles (UAVs), a post-disaster surveillance system is implemented using a Cognitive Radio Network (CRN). UAVs are distributed in the disaster area to capture live images of the damaged area and send them to the disaster management center. The CRN enables UAVs to utilize a portion of the spectrum of the Electronic Toll Collection (ETC) gates operating in the same area. In this paper, a joint transmission power selection, data-rate maximization, and interference mitigation problem is addressed. Considering all these conflicting parameters, this problem is investigated as a budget-constrained multi-player multi-armed bandit (MAB) problem. The whole process is done in a decentralized manner, where no information is exchanged between UAVs. To achieve this, two power-budget-aware MAB (PBA-MAB) algorithms, namely the upper confidence bound (PBA-UCB) algorithm and the Thompson sampling (PBA-TS) algorithm, are proposed to select the transmission power value efficiently. The proposed PBA-MAB algorithms show outstanding performance over random power value selection in terms of achievable data rate.
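The budget-constrained bandit idea in the abstract can be illustrated with a minimal single-player sketch. This is a generic budgeted UCB rule (pick the affordable arm with the highest optimistic reward per unit cost), not the paper's PBA-UCB algorithm, whose details are not given in this record. The power levels, success probabilities, budget, and the names `budgeted_ucb`, `POWER_COSTS`, `SUCCESS_PROB`, and `pull_rate` are all assumptions made for illustration.

```python
import math
import random

# Hypothetical transmission power levels (arbitrary units); each pull costs its power.
POWER_COSTS = [1.0, 2.0, 4.0]
# Hypothetical probability that a transmission at each level succeeds (reward 1).
SUCCESS_PROB = [0.2, 0.7, 0.8]


def pull_rate(arm, rng):
    """Simulated Bernoulli reward for one transmission at the chosen power level."""
    return 1.0 if rng.random() < SUCCESS_PROB[arm] else 0.0


def budgeted_ucb(costs, pull, budget, seed=0):
    """Play arms until no arm is affordable; return (counts, means, total_reward)."""
    rng = random.Random(seed)
    k = len(costs)
    counts = [0] * k      # number of pulls per arm
    means = [0.0] * k     # running mean reward per arm
    total_reward = 0.0
    t = 0
    while True:
        affordable = [i for i in range(k) if costs[i] <= budget]
        if not affordable:
            break
        t += 1
        untried = [i for i in affordable if counts[i] == 0]
        if untried:
            # Try every affordable arm once before using the index.
            arm = untried[0]
        else:
            # Optimistic reward estimate divided by the arm's cost.
            arm = max(
                affordable,
                key=lambda i: (means[i] + math.sqrt(2 * math.log(t) / counts[i]))
                / costs[i],
            )
        reward = pull(arm, rng)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]
        total_reward += reward
        budget -= costs[arm]
    return counts, means, total_reward


counts, means, total = budgeted_ucb(POWER_COSTS, pull_rate, budget=200.0)
print("pulls per power level:", counts)
print("estimated success rates:", [round(m, 2) for m in means])
print("total reward:", total)
```

Dividing the optimistic estimate by the cost makes the rule favor power levels that deliver the most reward per unit of budget rather than the highest raw rate. A multi-player, decentralized version as described in the abstract would additionally need to account for interference between UAVs, which this sketch omits.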
Pages: 19