Multi-Armed Bandit Learning in IoT Networks: Learning Helps Even in Non-stationary Settings

Cited by: 35
Authors
Bonnefoi, Remi [1 ]
Besson, Lilian [1 ,2 ]
Moy, Christophe [1 ]
Kaufmann, Emilie [2 ]
Palicot, Jacques [1 ]
Affiliations
[1] CentraleSupelec, IETR, SCEE Team, Campus Rennes,Ave Boulaie,CS 47601, F-35576 Cesson Sevigne, France
[2] Univ Lille 1, CNRS, INRIA, SequeL Team,CRIStAL,UMR 9189, F-59000 Lille, France
Keywords
Internet of Things; Multi-Armed Bandits; Reinforcement learning; Cognitive Radio; Non-stationary bandits;
DOI
10.1007/978-3-319-76207-4_15
CLC Classification Number
TN [Electronic Technology, Communication Technology];
Subject Classification Code
0809 ;
Abstract
Setting up the future Internet of Things (IoT) networks will require supporting more and more communicating devices. We prove that intelligent devices in unlicensed bands can use Multi-Armed Bandit (MAB) learning algorithms to improve resource exploitation. We evaluate the performance of two classical MAB learning algorithms, UCB1 and Thompson Sampling, in handling the decentralized decision-making of Spectrum Access applied to IoT networks, as well as their learning performance with a growing number of intelligent end-devices. We show that using learning algorithms does help to fit more devices in such networks, even when all end-devices are intelligent and are dynamically changing channel. In the studied scenario, stochastic MAB learning provides up to a 16% gain in terms of successful transmission probability, and has near-optimal performance even in non-stationary and non-i.i.d. settings with a majority of intelligent devices.
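The abstract describes each IoT device choosing a transmission channel with the UCB1 index policy, receiving reward 1 on a successful (acknowledged) transmission and 0 on a collision. The following is a minimal single-device sketch of that idea; the channel success probabilities, the horizon, and all variable names are illustrative assumptions, not values from the paper.

```python
import math
import random

class UCB1:
    """UCB1 index policy: play the arm maximizing
    empirical mean + sqrt(2 * ln t / n_k)."""
    def __init__(self, n_arms):
        self.counts = [0] * n_arms   # pulls per arm
        self.sums = [0.0] * n_arms   # cumulative reward per arm
        self.t = 0                   # global time step

    def select(self):
        self.t += 1
        # Initialization: play each arm once before using the index.
        for k in range(len(self.counts)):
            if self.counts[k] == 0:
                return k
        return max(
            range(len(self.counts)),
            key=lambda k: self.sums[k] / self.counts[k]
            + math.sqrt(2.0 * math.log(self.t) / self.counts[k]),
        )

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.sums[arm] += reward

# Toy stationary scenario (hypothetical): channel k succeeds with
# probability p[k]; reward 1 = ACK received, 0 = collision.
random.seed(42)
p = [0.2, 0.5, 0.8]
policy = UCB1(len(p))
horizon = 5000
successes = 0
for _ in range(horizon):
    arm = policy.select()
    reward = 1.0 if random.random() < p[arm] else 0.0
    policy.update(arm, reward)
    successes += reward

# After enough rounds, the most-played channel should be the best one.
best_channel = max(range(len(p)), key=lambda k: policy.counts[k])
print(best_channel, successes / horizon)
```

In the paper's actual setting, many such learners run in parallel and the collision probabilities seen by each device depend on the others' choices, which is what makes the problem non-stationary and non-i.i.d.; this sketch only covers the stationary single-device building block.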
Pages: 173 - 185
Page count: 13
Related Papers
50 records total
  • [1] Reinforcement learning and evolutionary algorithms for non-stationary multi-armed bandit problems
    Koulouriotis, D. E.
    Xanthopoulos, A.
    APPLIED MATHEMATICS AND COMPUTATION, 2008, 196 (02) : 913 - 922
  • [2] The non-stationary stochastic multi-armed bandit problem
    Allesiardo R.
    Féraud R.
    Maillard O.-A.
Springer Science and Business Media Deutschland GmbH (03): 267 - 283
  • [3] DYNAMIC SPECTRUM ACCESS WITH NON-STATIONARY MULTI-ARMED BANDIT
    Alaya-Feki, Afef Ben Hadj
    Moulines, Eric
    LeCornec, Alain
    2008 IEEE 9TH WORKSHOP ON SIGNAL PROCESSING ADVANCES IN WIRELESS COMMUNICATIONS, VOLS 1 AND 2, 2008, : 416 - 420
  • [4] Contextual Multi-Armed Bandit With Costly Feature Observation in Non-Stationary Environments
    Ghoorchian, Saeed
    Kortukov, Evgenii
    Maghsudi, Setareh
    IEEE OPEN JOURNAL OF SIGNAL PROCESSING, 2024, 5 : 820 - 830
  • [5] LLM-Informed Multi-Armed Bandit Strategies for Non-Stationary Environments
    de Curto, J.
    de Zarza, I.
    Roig, Gemma
    Cano, Juan Carlos
    Manzoni, Pietro
    Calafate, Carlos T.
    ELECTRONICS, 2023, 12 (13)
  • [7] Learning the Truth in Social Networks Using Multi-Armed Bandit
    Odeyomi, Olusola T.
    IEEE ACCESS, 2020, 8 : 137692 - 137701
  • [8] Bio-Inspired Meta-Learning for Active Exploration During Non-Stationary Multi-Armed Bandit Tasks
    Velentzas, George
    Tzafestas, Costas
    Khamassi, Mehdi
    PROCEEDINGS OF THE 2017 INTELLIGENT SYSTEMS CONFERENCE (INTELLISYS), 2017, : 661 - 669
  • [9] Active Learning on Heterogeneous Information Networks: A Multi-armed Bandit Approach
    Xin, Doris
    El-Kishky, Ahmed
    Liao, De
    Norick, Brandon
    Han, Jiawei
    2018 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2018, : 1350 - 1355
  • [10] Adaptive Active Learning as a Multi-armed Bandit Problem
    Czarnecki, Wojciech M.
    Podolak, Igor T.
    21ST EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE (ECAI 2014), 2014, 263 : 989 - 990