A Technical Note on Non-Stationary Parametric Bandits: Existing Mistakes and Preliminary Solutions

被引:0
|
作者
Faury, Louis [1 ,2 ]
Russac, Yoan [3 ,4 ,5 ,6 ]
Abeille, Marc [2 ]
Calauzenes, Clement [2 ]
机构
[1] LTCI TelecomParis, Paris, France
[2] Criteo AI Lab, Paris, France
[3] ENS Paris, Paris, France
[4] Univ PSL, Paris, France
[5] CNRS, Paris, France
[6] INRIA, Lille, France
来源
关键词
Stochastic Bandits; Generalized Linear Model; Non-Stationarity;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
In this note(1) we identify several mistakes appearing in the existing literature on non-stationary parametric bandits. More precisely, we study Generalized Linear Bandits (GLBs) in drifting environments, where the level of non-stationarity is characterized by a general metric known as the variation-budget. Existing methods to solve such problems typically involve forgetting mechanisms, which allow for a fine balance between the learning and tracking requirements of the problem. We uncover two significant mistakes in their theoretical analysis. The first arises when bounding the tracking error suffered by forgetting mechanisms. The second emerges when considering non-linear reward models, which requires extra care to balance the learning and tracking guarantees. We introduce a geometrical assumption on the arm set, sufficient to overcome the aforementioned technical gaps and recover minimax-optimality. We also share preliminary attempts at fixing those gaps under general configurations. Unfortunately, our solution yields degraded rates (w.r.t to the horizon), which raises new open questions regarding the optimality of forgetting mechanisms in non-stationary parametric bandits.
引用
收藏
页数:8
相关论文
共 50 条
  • [31] Parametric modelling of non-stationary signals: A unified approach
    Mukhopadhyay, S
    Sircar, P
    SIGNAL PROCESSING, 1997, 60 (02) : 135 - 152
  • [32] Non-stationary parametric spectral estimation for ultrasound attenuation
    Girault, JM
    Ossant, F
    Ouahabi, A
    Guittet, C
    Kouame, D
    Patat, F
    ACOUSTICAL IMAGING, VOL 23, 1997, 23 : 53 - 59
  • [33] Structural and parametric non-stationary modal control systems
    Fedosenkov, D. B.
    Simikova, A. A.
    Fedosenkov, B. A.
    XII ALL-RUSSIAN SCIENTIFIC AND PRACTICAL CONFERENCE (WITH INTERNATIONAL PARTICIPATION) ON AUTOMATION SYSTEMS IN EDUCATION, SCIENCE AND PRODUCTION, 2019, 2020, 865
  • [34] Achieving Optimal Dynamic Regret for Non-stationary Bandits without Prior Information
    Auer, Peter
    Chen, Yifang
    Gajane, Pratik
    Lee, Chung-Wei
    Luo, Haipeng
    Ortner, Ronald
    Wei, Chen-Yu
    CONFERENCE ON LEARNING THEORY, VOL 99, 2019, 99
  • [35] Stationary and non-stationary solutions of the evolution equation for neutrino in matter
    Chukhnova, A. V.
    Lobanov, A. E.
    XXTH INTERNATIONAL SEMINAR ON HIGH ENERGY PHYSICS (QUARKS-2018), 2018, 191
  • [36] ANACONDA: An Improved Dynamic Regret Algorithm for Adaptive Non-Stationary Dueling Bandits
    Buening, Thomas Kleine
    Saha, Aadirupa
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 206, 2023, 206
  • [37] A Change-Detection-Based Thompson Sampling Framework for Non-Stationary Bandits
    Ghatak, Gourab
    IEEE TRANSACTIONS ON COMPUTERS, 2021, 70 (10) : 1670 - 1676
  • [38] Contextual Multi-Armed Bandits for Non-Stationary Wireless Network Selection
    Martinez, Lluis
    Vidal, Josep
    Cabrera-Bean, Margarita
    IEEE CONFERENCE ON GLOBAL COMMUNICATIONS, GLOBECOM, 2023, : 285 - 290
  • [39] Non-Stationary Bandits under Recharging Payoffs: Improved Planning with Sublinear Regret
    Papadigenopoulos, Orestis
    Caramanis, Constantine
    Shakkottai, Sanjay
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
  • [40] Some algorithms for correlated bandits with non-stationary rewards : Regret bounds and applications
    Mayekar, Prathamesh
    Hemachandra, Nandyala
    PROCEEDINGS OF THE THIRD ACM IKDD CONFERENCE ON DATA SCIENCES (CODS), 2016,