Adaptive Noise Exploration for Neural Contextual Multi-Armed Bandits

Cited by: 0
Authors
Wang, Chi [1 ]
Shi, Lin [1 ]
Luo, Junru [1 ]
Affiliations
[1] Changzhou Univ, Sch Comp Sci & Artificial Intelligence, Changzhou 213000, Peoples R China
Keywords
multi-armed bandits; exploration and exploitation; adaptive noise exploration;
DOI
10.3390/a18020056
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
In contextual multi-armed bandits, the relationship between contextual information and rewards is typically unknown, complicating the trade-off between exploration and exploitation. A common approach to address this challenge is the Upper Confidence Bound (UCB) method, which constructs confidence intervals to guide exploration. However, the UCB method becomes computationally expensive in environments with numerous arms and dynamic contexts. This paper presents an adaptive noise exploration framework to reduce computational complexity and introduces two novel algorithms: EAD (Exploring Adaptive Noise in Decision-Making Processes) and EAP (Exploring Adaptive Noise in Parameter Spaces). EAD injects adaptive noise into the reward signals based on arm selection frequency, while EAP adds adaptive noise to the hidden layer of the neural network for more stable exploration. Experimental results on recommendation and classification tasks show that both algorithms significantly surpass traditional linear and neural methods in computational efficiency and overall performance.
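The abstract's core mechanism — perturbing each arm's estimated reward with noise that shrinks as that arm is selected more often — can be sketched as below. This is an illustrative reconstruction only: the `1/sqrt(1 + n)` decay schedule, the `sigma0` scale, and the Bernoulli test bed are assumptions, not the paper's exact EAD formulation.

```python
import numpy as np

def ead_style_select(est_rewards, counts, sigma0=1.0, rng=None):
    """Pick an arm by adding Gaussian noise to each arm's estimated reward.
    The noise scale decays with the arm's selection count, so rarely pulled
    arms are explored more. (Sketch; schedule is an assumption.)"""
    rng = rng or np.random.default_rng()
    est_rewards = np.asarray(est_rewards, dtype=float)
    counts = np.asarray(counts, dtype=float)
    sigma = sigma0 / np.sqrt(1.0 + counts)   # less-pulled arms get larger noise
    noisy = est_rewards + rng.normal(0.0, sigma)
    return int(np.argmax(noisy))

# Tiny simulation on a 3-armed Bernoulli bandit.
rng = np.random.default_rng(0)
true_p = np.array([0.2, 0.5, 0.8])
counts = np.zeros(3)
sums = np.zeros(3)
for t in range(2000):
    # Optimistic estimate of 1.0 for arms that have never been pulled.
    est = np.where(counts > 0, sums / np.maximum(counts, 1), 1.0)
    a = ead_style_select(est, counts, sigma0=0.5, rng=rng)
    counts[a] += 1
    sums[a] += rng.random() < true_p[a]
print(int(np.argmax(counts)))  # the highest-reward arm should dominate pulls
```

Compared with UCB, this rule needs no confidence-interval computation per arm — only a noise draw — which reflects the computational-efficiency motivation stated in the abstract.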
Pages: 17