Multi-Armed Bandit Learning in IoT Networks: Learning Helps Even in Non-stationary Settings

Cited by: 35
Authors
Bonnefoi, Remi [1 ]
Besson, Lilian [1 ,2 ]
Moy, Christophe [1 ]
Kaufmann, Emilie [2 ]
Palicot, Jacques [1 ]
Affiliations
[1] CentraleSupelec, IETR, SCEE Team, Campus Rennes,Ave Boulaie,CS 47601, F-35576 Cesson Sevigne, France
[2] Univ Lille 1, CNRS, INRIA, SequeL Team,CRIStAL,UMR 9189, F-59000 Lille, France
Source
COGNITIVE RADIO ORIENTED WIRELESS NETWORKS, 2018, Vol. 228
Keywords
Internet of Things; Multi-Armed Bandits; Reinforcement learning; Cognitive Radio; Non-stationary bandits;
DOI
10.1007/978-3-319-76207-4_15
CLC classification
TN [Electronic technology; communication technology];
Subject classification
0809 ;
Abstract
Setting up the future Internet of Things (IoT) networks will require supporting more and more communicating devices. We prove that intelligent devices in unlicensed bands can use Multi-Armed Bandit (MAB) learning algorithms to improve resource exploitation. We evaluate the performance of two classical MAB learning algorithms, UCB1 and Thompson Sampling, for handling the decentralized decision-making of Spectrum Access applied to IoT networks, as well as the learning performance with a growing number of intelligent end-devices. We show that using learning algorithms does help to fit more devices in such networks, even when all end-devices are intelligent and dynamically change channels. In the studied scenario, stochastic MAB learning provides an up to 16% gain in terms of successful transmission probability, and has near-optimal performance even in non-stationary and non-i.i.d. settings with a majority of intelligent devices.
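The two algorithms named in the abstract are standard index and Bayesian policies for Bernoulli rewards (a reward of 1 modeling a successful transmission on the chosen channel). The sketch below is a minimal generic illustration of UCB1 and Beta-Bernoulli Thompson Sampling for channel selection, not the authors' implementation; class and variable names are our own.

```python
import math
import random

class UCB1:
    """UCB1 index policy: empirical mean plus an exploration bonus."""
    def __init__(self, n_arms):
        self.counts = [0] * n_arms    # number of pulls per channel
        self.means = [0.0] * n_arms   # empirical success rate per channel

    def select(self, t):
        # Play each channel once before relying on the UCB index.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        # Choose the channel maximizing mean + sqrt(2 log t / pulls).
        return max(range(len(self.counts)),
                   key=lambda a: self.means[a]
                   + math.sqrt(2.0 * math.log(t) / self.counts[a]))

    def update(self, arm, reward):
        # Incremental update of the empirical mean.
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]

class ThompsonSampling:
    """Beta-Bernoulli Thompson Sampling: sample each posterior, play argmax."""
    def __init__(self, n_arms):
        self.alpha = [1] * n_arms  # 1 + observed successes
        self.beta = [1] * n_arms   # 1 + observed failures

    def select(self, t=None):
        samples = [random.betavariate(self.alpha[a], self.beta[a])
                   for a in range(len(self.alpha))]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, reward):
        if reward:
            self.alpha[arm] += 1
        else:
            self.beta[arm] += 1
```

In a stationary simulation both policies concentrate their plays on the channel with the highest success probability; the paper's point is that this behavior largely survives even when many devices learn simultaneously, making the reward process non-stationary.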
Pages: 173-185
Page count: 13