A Risk-Averse Framework for Non-Stationary Stochastic Multi-Armed Bandits

Authors
Alami, Reda [1]
Mahfoud, Mohammed [2]
Achab, Mastane [1]
Affiliations
[1] Technology Innovation Institute, Masdar City, United Arab Emirates
[2] Montreal Institute for Learning Algorithms, Montreal, PQ, Canada
Keywords
Non-stationary environments; risk-averse bandits; change-point detection
DOI
10.1109/ICDMW60847.2023.00040
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In a typical stochastic multi-armed bandit problem, the objective is often to maximize the expected sum of rewards over some time horizon T. While a strategy that accomplishes this is optimal when no additional information is available, this is no longer the case when additional environment-specific knowledge is provided. In particular, in areas of high volatility like healthcare or finance, a naive reward-maximization approach often does not accurately capture the complexity of the learning problem and results in unreliable solutions. To tackle problems of this nature, we propose a framework of adaptive risk-aware strategies that operate in non-stationary environments. Our framework incorporates various risk measures prevalent in the literature to map multiple families of multi-armed bandit algorithms into a risk-sensitive setting. In addition, we equip the resulting algorithms with the Restarted Bayesian Online Change-Point Detection (R-BOCPD) algorithm and impose a (tunable) forced exploration strategy to detect local (per-arm) switches. We provide finite-time theoretical guarantees and an asymptotic regret bound of $\tilde{O}(\sqrt{K_T T})$ up to time horizon T, with $K_T$ the total number of change-points. In practice, our framework compares favorably to the state of the art in both synthetic and real-world environments and manages to perform efficiently with respect to both risk-sensitivity and non-stationarity.
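The three ingredients named in the abstract (a risk measure in place of the mean, forced exploration, and per-arm restarts after a detected change) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the risk measure is an empirical CVaR chosen as one common example, the index with its exploration bonus is a heuristic, and the change detector is a crude two-window mean-shift test swapped in for R-BOCPD. All names and parameters (`empirical_cvar`, `mean_shift_detected`, `run_bandit`, `alpha`, `explore_prob`, `threshold`) are hypothetical.

```python
import math
import random
from statistics import fmean


def empirical_cvar(rewards, alpha):
    """Mean of the worst alpha-fraction of observed rewards (lower tail)."""
    k = max(1, math.ceil(alpha * len(rewards)))
    return fmean(sorted(rewards)[:k])


def mean_shift_detected(rewards, threshold, min_len=10):
    """Crude stand-in detector (NOT R-BOCPD): compare the means of the
    older and newer halves of an arm's reward history."""
    n = len(rewards)
    if n < 2 * min_len:
        return False
    half = n // 2
    return abs(fmean(rewards[:half]) - fmean(rewards[half:])) > threshold


def run_bandit(sample_reward, n_arms, horizon, alpha=0.25,
               explore_prob=0.05, threshold=0.5, seed=0):
    """Risk-averse index policy with forced exploration and per-arm restarts.

    sample_reward(arm, t, rng) is an assumed environment callback.
    Returns the list of pulled arms and a list of (time, arm) restarts.
    """
    rng = random.Random(seed)
    history = [[] for _ in range(n_arms)]
    pulls, restarts = [], []

    for t in range(horizon):
        untried = [a for a in range(n_arms) if not history[a]]
        if untried:
            arm = untried[0]                 # pull every arm at least once
        elif rng.random() < explore_prob:
            arm = rng.randrange(n_arms)      # forced (uniform) exploration
        else:
            def index(a):
                # Heuristic index: empirical CVaR plus an exploration bonus.
                bonus = math.sqrt(2 * math.log(t + 1) / len(history[a]))
                return empirical_cvar(history[a], alpha) + bonus
            arm = max(range(n_arms), key=index)

        history[arm].append(sample_reward(arm, t, rng))
        pulls.append(arm)

        if mean_shift_detected(history[arm], threshold):
            # Local restart: forget the older half of this arm's history only.
            history[arm] = history[arm][len(history[arm]) // 2:]
            restarts.append((t, arm))

    return pulls, restarts
```

Forced exploration matters here because an arm that is never pulled produces no samples, so a change in its distribution could not be detected; the `explore_prob` parameter stands in for the tunable exploration rate mentioned in the abstract.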
Pages: 272-280
Number of pages: 9