A Technical Note on Non-Stationary Parametric Bandits: Existing Mistakes and Preliminary Solutions

被引:0
|
作者
Faury, Louis [1 ,2 ]
Russac, Yoan [3 ,4 ,5 ,6 ]
Abeille, Marc [2 ]
Calauzenes, Clement [2 ]
机构
[1] LTCI TelecomParis, Paris, France
[2] Criteo AI Lab, Paris, France
[3] ENS Paris, Paris, France
[4] Univ PSL, Paris, France
[5] CNRS, Paris, France
[6] INRIA, Lille, France
来源
关键词
Stochastic Bandits; Generalized Linear Model; Non-Stationarity;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
In this note(1) we identify several mistakes appearing in the existing literature on non-stationary parametric bandits. More precisely, we study Generalized Linear Bandits (GLBs) in drifting environments, where the level of non-stationarity is characterized by a general metric known as the variation-budget. Existing methods to solve such problems typically involve forgetting mechanisms, which allow for a fine balance between the learning and tracking requirements of the problem. We uncover two significant mistakes in their theoretical analysis. The first arises when bounding the tracking error suffered by forgetting mechanisms. The second emerges when considering non-linear reward models, which requires extra care to balance the learning and tracking guarantees. We introduce a geometrical assumption on the arm set, sufficient to overcome the aforementioned technical gaps and recover minimax-optimality. We also share preliminary attempts at fixing those gaps under general configurations. Unfortunately, our solution yields degraded rates (w.r.t to the horizon), which raises new open questions regarding the optimality of forgetting mechanisms in non-stationary parametric bandits.
引用
收藏
页数:8
相关论文
共 50 条
  • [1] Non-stationary Bandits with Knapsacks
    Liu, Shang
    Jiang, Jiashuo
    Li, Xiaocheng
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
  • [2] Unifying Clustered and Non-stationary Bandits
    Li, Chuanhao
    Wu, Qingyun
    Wang, Hongning
    24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), 2021, 130
  • [3] Non-stationary Bandits with Heavy Tail
    Pan, Weici
    Liu, Zhenhua
    Performance Evaluation Review, 2024, 52 (02): : 33 - 35
  • [4] Cascading Non-Stationary Bandits: Online Learning to Rank in the Non-Stationary Cascade Model
    Li, Chang
    de Rijke, Maarten
    PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 2859 - 2865
  • [5] A Simple Approach for Non-stationary Linear Bandits
    Zhao, Peng
    Zhang, Lijun
    Jiang, Yuan
    Zhou, Zhi-Hua
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108 : 746 - 754
  • [6] Learning Contextual Bandits in a Non-stationary Environment
    Wu, Qingyun
    Iyer, Naveen
    Wang, Hongning
    ACM/SIGIR PROCEEDINGS 2018, 2018, : 495 - 504
  • [7] Weighted Linear Bandits for Non-Stationary Environments
    Russac, Yoan
    Vernade, Claire
    Cappe, Olivier
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [8] Competing Bandits in Non-Stationary Matching Markets
    Ghosh, Avishek
    Sankararaman, Abishek
    Ramchandran, Kannan
    Javidi, Tara
    Mazumdar, Arya
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2024, 70 (04) : 2831 - 2850
  • [9] NON-STATIONARY PARAMETRIC OSCILLATIONS
    NAYFEH, AH
    ASFAR, KR
    JOURNAL OF SOUND AND VIBRATION, 1988, 124 (03) : 529 - 537
  • [10] Weighted Gaussian Process Bandits for Non-stationary Environments
    Deng, Yuntian
    Zhou, Xingyu
    Kim, Baekjin
    Tewari, Ambuj
    Gupta, Abhishek
    Shroff, Ness
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 151, 2022, 151