Adaptive Noise Exploration for Neural Contextual Multi-Armed Bandits

Cited: 0
Authors
Wang, Chi [1 ]
Shi, Lin [1 ]
Luo, Junru [1 ]
Institutions
[1] Changzhou Univ, Sch Comp Sci & Artificial Intelligence, Changzhou 213000, Peoples R China
Keywords
multi-armed bandits; exploration and exploitation; adaptive noise exploration;
DOI
10.3390/a18020056
CLC Classification: TP18 [Artificial Intelligence Theory]
Subject Classification: 081104; 0812; 0835; 1405
Abstract
In contextual multi-armed bandits, the relationship between contextual information and rewards is typically unknown, complicating the trade-off between exploration and exploitation. A common approach to address this challenge is the Upper Confidence Bound (UCB) method, which constructs confidence intervals to guide exploration. However, the UCB method becomes computationally expensive in environments with numerous arms and dynamic contexts. This paper presents an adaptive noise exploration framework to reduce computational complexity and introduces two novel algorithms: EAD (Exploring Adaptive Noise in Decision-Making Processes) and EAP (Exploring Adaptive Noise in Parameter Spaces). EAD injects adaptive noise into the reward signals based on arm selection frequency, while EAP adds adaptive noise to the hidden layer of the neural network for more stable exploration. Experimental results on recommendation and classification tasks show that both algorithms significantly surpass traditional linear and neural methods in computational efficiency and overall performance.
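The core idea described in the abstract, injecting exploration noise whose magnitude shrinks as an arm is selected more often, can be illustrated with a minimal toy sketch. This is not the paper's EAD/EAP implementation (those operate on neural reward signals and hidden layers); it is an assumed linear-bandit analogue, and the `1/sqrt(count)` noise schedule and all variable names are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n_arms, dim, rounds = 5, 8, 500
theta = rng.normal(size=(n_arms, dim))       # hidden per-arm reward weights

counts = np.ones(n_arms)                     # arm selection frequencies
A = np.stack([np.eye(dim)] * n_arms)         # per-arm ridge-regression Gram matrices
b = np.zeros((n_arms, dim))                  # per-arm reward-weighted context sums

for t in range(rounds):
    x = rng.normal(size=dim)                 # context for this round
    # point estimate of each arm's reward from least-squares state
    est = np.array([np.linalg.solve(A[a], b[a]) @ x for a in range(n_arms)])
    # adaptive noise: scale decays as an arm's selection count grows,
    # so rarely pulled arms are explored more aggressively
    noise = rng.normal(size=n_arms) / np.sqrt(counts)
    a = int(np.argmax(est + noise))
    r = theta[a] @ x + 0.1 * rng.normal()    # observed noisy reward
    counts[a] += 1
    A[a] += np.outer(x, x)
    b[a] += r * x
```

Compared with UCB-style methods, selecting by a noisy argmax avoids constructing per-arm confidence intervals each round, which is the computational saving the abstract attributes to the adaptive-noise framework.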
Pages: 17