Aggregation of Multi-Armed Bandits Learning Algorithms for Opportunistic Spectrum Access

Cited by: 0
|
Authors
Besson, Lilian [1 ,2 ,3 ]
Kaufmann, Emilie [2 ,3 ]
Moy, Christophe [4 ]
Affiliations
[1] IETR SCEE, Cent Supelec, Cesson Sevigne, France
[2] Univ Lille 1, Inria SequeL, CNRS, Lille, France
[3] Univ Lille 1, Inria SequeL, CRIStAL, Lille, France
[4] Univ Rennes, CNRS, IETR UMR 6164, Rennes, France
Source
2018 IEEE WIRELESS COMMUNICATIONS AND NETWORKING CONFERENCE (WCNC) | 2018
Keywords
cognitive radio; learning theory; robust aggregation algorithms; multi-armed bandits; reinforcement learning
DOI
Not available
Chinese Library Classification (CLC)
TP3 [computing technology, computer technology];
Discipline Code
0812
Abstract
Multi-armed bandit algorithms have recently been studied and evaluated for Cognitive Radio (CR), especially in the context of Opportunistic Spectrum Access (OSA). Several solutions based on various models have been explored, but it is hard to predict exactly which will perform best under real-world conditions at every instant. Hence, expert aggregation algorithms can be useful to select, on the fly, the best algorithm for a specific situation. Aggregation algorithms such as Exp4, dating back to 2002, have never been used for OSA learning, and we show that Exp4 is empirically inefficient when applied to simple stochastic problems. In this article, we present an improved variant, called Aggregator. For synthetic OSA problems modeled as Multi-Armed Bandit (MAB) problems, simulation results are presented to demonstrate its empirical efficiency. We combine classical algorithms, such as Thompson sampling, Upper-Confidence Bounds algorithms (UCB and variants), and Bayesian or Kullback-Leibler UCB. Our algorithm offers good performance compared to state-of-the-art aggregation algorithms (Exp4, CORRAL, or LearnExp) and appears as a robust approach to select, on the fly, the best algorithm for any stochastic MAB problem, making it more suitable for real-world radio settings than any tuning-based approach.
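The aggregation idea described in the abstract — maintaining exponential weights over a pool of bandit "experts" (UCB, Thompson sampling, ...) and feeding every observed reward back to all of them — can be sketched in an Exp4-style way as follows. This is an illustrative sketch under our own assumptions, not the paper's Aggregator implementation; the names (`UCB1`, `ThompsonSampling`, `aggregate`) and the parameters `eta` (learning rate) and `gamma` (uniform-exploration mixing) are ours.

```python
import math
import random


class UCB1:
    """UCB1 index policy for Bernoulli-reward arms."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms
        self.sums = [0.0] * n_arms
        self.t = 0

    def choose(self):
        self.t += 1
        for arm, count in enumerate(self.counts):
            if count == 0:  # play each arm once before using indices
                return arm
        return max(
            range(len(self.counts)),
            key=lambda a: self.sums[a] / self.counts[a]
            + math.sqrt(2.0 * math.log(self.t) / self.counts[a]),
        )

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.sums[arm] += reward


class ThompsonSampling:
    """Beta-Bernoulli Thompson sampling."""

    def __init__(self, n_arms, rng=random):
        self.alpha = [1.0] * n_arms
        self.beta = [1.0] * n_arms
        self.rng = rng

    def choose(self):
        samples = [self.rng.betavariate(a, b)
                   for a, b in zip(self.alpha, self.beta)]
        return samples.index(max(samples))

    def update(self, arm, reward):
        self.alpha[arm] += reward
        self.beta[arm] += 1.0 - reward


def aggregate(experts, arm_means, horizon, eta=0.1, gamma=0.1, seed=0):
    """Exp4-style exponential-weights aggregation of bandit experts.

    Each step: sample one expert proportionally to its weight (mixed with
    uniform exploration gamma), play the arm it picks, give the observed
    reward to *all* experts, and boost the sampled expert's weight by an
    importance-weighted exponential factor.  Returns total reward.
    """
    rng = random.Random(seed)
    n = len(experts)
    weights = [1.0] * n
    total = 0.0
    for _ in range(horizon):
        w_sum = sum(weights)
        probs = [(1.0 - gamma) * w / w_sum + gamma / n for w in weights]
        # Sample one expert index k according to probs.
        r, acc, k = rng.random(), 0.0, n - 1
        for i, p in enumerate(probs):
            acc += p
            if r <= acc:
                k = i
                break
        arm = experts[k].choose()
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        total += reward
        for expert in experts:  # every expert learns from the same sample
            expert.update(arm, reward)
        weights[k] *= math.exp(eta * reward / probs[k])
        m = max(weights)
        weights = [w / m for w in weights]  # renormalize to avoid overflow
    return total
```

The `gamma` mixing keeps every sampling probability bounded away from zero, so the importance weight `reward / probs[k]` stays bounded; that is the standard Exp4 device for controlling the variance of the update.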
Pages: 6
Related Papers
50 records in total
  • [1] Multi-User Multi-Armed Bandits for Uncoordinated Spectrum Access
    Bande, Meghana
    Veeravalli, Venugopal V.
    2019 INTERNATIONAL CONFERENCE ON COMPUTING, NETWORKING AND COMMUNICATIONS (ICNC), 2019, : 653 - 657
  • [2] Active Learning in Multi-armed Bandits
    Antos, Andras
    Grover, Varun
    Szepesvari, Csaba
    ALGORITHMIC LEARNING THEORY, PROCEEDINGS, 2008, 5254 : 287 - +
  • [3] Quantum greedy algorithms for multi-armed bandits
    Hiroshi Ohno
    Quantum Information Processing, 22
  • [4] Algorithms for Differentially Private Multi-Armed Bandits
    Tossou, Aristide C. Y.
    Dimitrakakis, Christos
    THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2016, : 2087 - 2093
  • [5] Quantum Exploration Algorithms for Multi-Armed Bandits
    Wang, Daochen
    You, Xuchen
    Li, Tongyang
    Childs, Andrew M.
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 10102 - 10110
  • [6] Optimal Algorithms for Multiplayer Multi-Armed Bandits
    Wang, Po-An
    Proutiere, Alexandre
    Ariu, Kaito
    Jedra, Yassir
    Russo, Alessio
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108
  • [7] Optimal Streaming Algorithms for Multi-Armed Bandits
    Jin, Tianyuan
    Huang, Keke
    Tang, Jing
    Xiao, Xiaokui
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [8] Quantum greedy algorithms for multi-armed bandits
    Ohno, Hiroshi
    QUANTUM INFORMATION PROCESSING, 2023, 22 (02)
  • [9] Opportunistic Spectrum Access Based on a Constrained Multi-Armed Bandit Formulation
    Ai, Jing
    Abouzeid, Alhussein A.
    JOURNAL OF COMMUNICATIONS AND NETWORKS, 2009, 11 (02) : 134 - 147
  • [10] Quantum Reinforcement Learning for Multi-Armed Bandits
    Liu, Yi-Pei
    Li, Kuo
    Cao, Xi
    Jia, Qing-Shan
    Wang, Xu
    2022 41ST CHINESE CONTROL CONFERENCE (CCC), 2022, : 5675 - 5680