Distributed Spectrum Management in Cognitive Radio Networks by Consensus-Based Reinforcement Learning

Cited by: 4
Authors
Dasic, Dejan [1 ,2 ,3 ]
Ilic, Nemanja [1 ,4 ]
Vucetic, Miljan [1 ]
Peric, Miroslav [1 ]
Beko, Marko [5 ,6 ]
Stankovic, Milos S. [1 ,2 ]
Affiliations
[1] Vlatacom Inst, Dept Artificial Intelligence, Belgrade 11070, Serbia
[2] Singidunum Univ, Fac Tech Sci, Belgrade 11000, Serbia
[3] Univ Lusofona Humanidades & Tecnol, COPELABS, P-1749024 Lisbon, Portugal
[4] Coll Appl Tech Sci, Dept Informat Technol, Krusevac 37000, Serbia
[5] Univ Lisbon, Inst Super Tecn, Inst Telecomunicacoes, P-1049001 Lisbon, Portugal
[6] Univ Union Nikola Tesla, Fac Informat Technol & Engn, Belgrade 11158, Serbia
Keywords
multi-agent reinforcement learning; consensus algorithm; cognitive radio networking; joint spectrum sensing and channel selection; distributed policy evaluation; distributed Q-learning; off-policy temporal difference; COMPREHENSIVE SURVEY; CHANNEL SELECTION; ACCESS; CONVERGENCE; FRAMEWORK;
DOI
10.3390/s21092970
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry];
Subject classification codes
070302 ; 081704 ;
Abstract
In this paper, we propose a new consensus-based algorithm for distributed spectrum sensing and channel selection in cognitive radio networks. The algorithm operates within a multi-agent reinforcement learning scheme. The proposed consensus strategy, implemented over a directed, typically sparse, time-varying, low-bandwidth communication network, enforces collaboration between the agents in a completely decentralized and distributed way. The motivation for the proposed approach comes directly from practical cognitive radio network scenarios, where a decentralized setting and distributed operation are essential. Specifically, the proposed setting provides all the agents, under unknown environmental and application conditions, with viable network-wide information. Hence, a set of participating agents can successfully compute the optimal joint spectrum sensing and channel selection strategy even if the individual agents cannot. The proposed algorithm is, by its nature, scalable and robust to node and link failures. The paper presents a detailed discussion and analysis of the algorithm's characteristics, including the effects of denoising, the possibility of organizing coordinated actions, and the convergence rate improvement induced by the consensus scheme. The results of extensive simulations demonstrate that the proposed algorithm is highly effective and that its behavior is close to that of the centralized scheme even with sparse, neighbor-based inter-node communication.
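The denoising effect of a consensus scheme mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it omits the reinforcement learning updates entirely and uses a fixed rather than time-varying network. The directed ring topology, the weight of 0.5, the per-channel values, and the names `consensus_step` and `directed_ring_weights` are all assumptions made for illustration.

```python
import numpy as np

def directed_ring_weights(n, self_weight=0.5):
    """Row-stochastic weights for a sparse directed ring (assumed topology):
    agent i listens only to itself and to agent (i - 1) mod n."""
    w = np.zeros((n, n))
    for i in range(n):
        w[i, i] = self_weight
        w[i, (i - 1) % n] = 1.0 - self_weight
    return w

def consensus_step(estimates, weights):
    """One consensus round: every agent replaces its estimate with a
    weighted average of its own and its in-neighbor's estimates."""
    return weights @ estimates

rng = np.random.default_rng(0)
n_agents = 6
true_values = np.array([0.2, 0.9, 0.5])  # made-up per-channel values

# Each agent starts from its own noisy local estimate of the channel values.
local = true_values + rng.normal(0.0, 0.3, size=(n_agents, true_values.size))

W = directed_ring_weights(n_agents)
x = local.copy()
for _ in range(200):
    x = consensus_step(x, W)

# Disagreement between agents before and after the consensus rounds.
spread_before = (local.max(axis=0) - local.min(axis=0)).max()
spread_after = (x.max(axis=0) - x.min(axis=0)).max()

# Worst single-agent error versus the error of the agreed-upon estimate.
worst_local_error = np.abs(local - true_values).max()
consensus_error = np.abs(x - true_values).max()
```

Because this particular weight matrix is also column-stochastic, the agents agree on the plain average of their initial estimates, so `consensus_error` cannot exceed `worst_local_error`. With general directed weights the agreed value would be a weighted average instead.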
Pages: 20
Related papers
45 in total
[1]   Distributed target tracking in sensor networks using multi-step consensus [J].
Al Ali, Khaled Obaid ;
Ilic, Nemanja ;
Stankovic, Milos S. ;
Stankovic, Srdjan S. .
IET RADAR SONAR AND NAVIGATION, 2018, 12 (09) :998-1004
[2]  
[Anonymous], 2009, P 26 ANN INT C MACH
[3]   A Comprehensive Survey on Spectrum Sensing in Cognitive Radio Networks: Recent Advances, New Challenges, and Future Research Directions [J].
Arjoune, Youness ;
Kaabouch, Naima .
SENSORS, 2019, 19 (01)
[4]   Efficient Beamforming in Cognitive Radio Multicast Transmission [J].
Beko, Marko .
IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, 2012, 11 (11) :4108-4117
[5]  
Bhandari J., 2018, C LEARN THEOR COLT, P1691
[6]   Randomized gossip algorithms [J].
Boyd, Stephen ;
Ghosh, Arpita ;
Prabhakar, Balaji ;
Shah, Devavrat .
IEEE TRANSACTIONS ON INFORMATION THEORY, 2006, 52 (06) :2508-2530
[7]   A comprehensive survey of multiagent reinforcement learning [J].
Busoniu, Lucian ;
Babuska, Robert ;
De Schutter, Bart .
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART C-APPLICATIONS AND REVIEWS, 2008, 38 (02) :156-172
[8]  
Dasic D., 2020, P 10 INT C WEB INT M
[9]  
Di Felice M., 2010, P WIR WIR INT COMM L
[10]  
Felice M. D., 2019, HDB COGNITIVE RADIO, P1849