A Technical Note on Non-Stationary Parametric Bandits: Existing Mistakes and Preliminary Solutions

被引:0
|
作者
Faury, Louis [1 ,2 ]
Russac, Yoan [3 ,4 ,5 ,6 ]
Abeille, Marc [2 ]
Calauzenes, Clement [2 ]
机构
[1] LTCI TelecomParis, Paris, France
[2] Criteo AI Lab, Paris, France
[3] ENS Paris, Paris, France
[4] Univ PSL, Paris, France
[5] CNRS, Paris, France
[6] INRIA, Lille, France
来源
关键词
Stochastic Bandits; Generalized Linear Model; Non-Stationarity;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
In this note(1) we identify several mistakes appearing in the existing literature on non-stationary parametric bandits. More precisely, we study Generalized Linear Bandits (GLBs) in drifting environments, where the level of non-stationarity is characterized by a general metric known as the variation-budget. Existing methods to solve such problems typically involve forgetting mechanisms, which allow for a fine balance between the learning and tracking requirements of the problem. We uncover two significant mistakes in their theoretical analysis. The first arises when bounding the tracking error suffered by forgetting mechanisms. The second emerges when considering non-linear reward models, which requires extra care to balance the learning and tracking guarantees. We introduce a geometrical assumption on the arm set, sufficient to overcome the aforementioned technical gaps and recover minimax-optimality. We also share preliminary attempts at fixing those gaps under general configurations. Unfortunately, our solution yields degraded rates (w.r.t to the horizon), which raises new open questions regarding the optimality of forgetting mechanisms in non-stationary parametric bandits.
引用
收藏
页数:8
相关论文
共 50 条
  • [21] Non-Stationary Bandits with Auto-Regressive Temporal Dependency
    Chen, Qinyi
    Golrezaei, Negin
    Bouneffouf, Djallel
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [22] A New Look at Dynamic Regret for Non-Stationary Stochastic Bandits
    Abbasi-Yadkori, Yasin
    Gyorgy, Andraes
    Lazic, Nevena
    JOURNAL OF MACHINE LEARNING RESEARCH, 2023, 24
  • [23] Stochastic Bandits With Non-Stationary Rewards: Reward Attack and Defense
    Yang, Chenye
    Liu, Guanlin
    Lai, Lifeng
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2024, 72 : 5007 - 5020
  • [24] Non-stationary non-parametric volatility model
    Han, Heejoon
    Zhang, Shen
    ECONOMETRICS JOURNAL, 2012, 15 (02): : 204 - 225
  • [25] NON-STATIONARY SQUEEZING IN A PARAMETRIC-AMPLIFIER
    EKERT, A
    KNIGHT, PL
    OPTICS COMMUNICATIONS, 1989, 71 (1-2) : 107 - 112
  • [26] NON-STATIONARY FLUXES OF VISCOELASTIC SOLUTIONS
    Hernandez Garcia, Anier
    Sotolongo Costa, Oscar
    REVISTA CUBANA DE FISICA, 2011, 28 (1E): : 1E32 - 1E36
  • [27] Optimal and Efficient Dynamic Regret Algorithms for Non-Stationary Dueling Bandits
    Saha, Aadirupa
    Gupta, Shubham
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022, : 19027 - 19049
  • [28] Non-stationary Continuum-armed Bandits for Online Hyperparameter Optimization
    Lu, Shiyin
    Zhou, Yu-Hang
    Shi, Jing-Cheng
    Zhu, Wenya
    Yu, Qingtao
    Chen, Qing-Guo
    Da, Qing
    Zhang, Lijun
    WSDM'22: PROCEEDINGS OF THE FIFTEENTH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING, 2022, : 618 - 627
  • [29] A NOTE ON NON-STATIONARY SUPERSONIC WING THEORY
    STEWART, HJ
    PHYSICAL REVIEW, 1949, 76 (06): : 882 - 882
  • [30] A non-parametric non-stationary procedure for failure prediction
    Pfefferman, JD
    Cemuschi-Frías, B
    IEEE TRANSACTIONS ON RELIABILITY, 2002, 51 (04) : 434 - 442