Time-Decaying Bandits for Non-stationary Systems

被引:5
|
作者
机构
[1] Komiyama, Junpei
[2] Qin, Tao
来源
Komiyama, Junpei | 1600年 / Springer Verlag卷 / 8877期
关键词
Stochastic systems;
D O I
10.1007/978-3-319-13129-0_40
中图分类号
学科分类号
摘要
Contents displayed on web portals (e.g., news articles at Yahoo.com) are usually adaptively selected from a dynamic set of candidate items, and the attractiveness of each item decays over time. The goal of those websites is to maximize the engagement of users (usually measured by their clicks) on the selected items.We formulate this kind of applications as a new variant of bandit problems where new arms are dynamically added into the candidate set and the expected reward of each arm decays as the round proceeds. For this new problem, a direct application of the algorithms designed for stochastic MAB (e.g., UCB) will lead to over-estimation of the rewards of old arms, and thus cause a misidentification of the optimal arm. To tackle this challenge, we propose a new algorithm that can adaptively estimate the temporal dynamics in the rewards of the arms, and effectively identify the best arm at a given time point on this basis. When the temporal dynamics are represented by a set of features, the proposed algorithm is able to enjoy a sub-linear regret. Our experiments verify the effectiveness of the proposed algorithm. © Springer International Publishing Switzerland 2014.
引用
收藏
相关论文
共 50 条
  • [1] Time-Decaying Bandits for Non-stationary Systems
    Komiyama, Junpei
    Qin, Tao
    WEB AND INTERNET ECONOMICS, 2014, 8877 : 460 - 466
  • [2] Non-stationary Bandits with Knapsacks
    Liu, Shang
    Jiang, Jiashuo
    Li, Xiaocheng
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
  • [3] Unifying Clustered and Non-stationary Bandits
    Li, Chuanhao
    Wu, Qingyun
    Wang, Hongning
    24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), 2021, 130
  • [4] Non-stationary Bandits with Heavy Tail
    Pan, Weici
    Liu, Zhenhua
    Performance Evaluation Review, 2024, 52 (02): : 33 - 35
  • [5] Cascading Non-Stationary Bandits: Online Learning to Rank in the Non-Stationary Cascade Model
    Li, Chang
    de Rijke, Maarten
    PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 2859 - 2865
  • [6] A Simple Approach for Non-stationary Linear Bandits
    Zhao, Peng
    Zhang, Lijun
    Jiang, Yuan
    Zhou, Zhi-Hua
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108 : 746 - 754
  • [7] Learning Contextual Bandits in a Non-stationary Environment
    Wu, Qingyun
    Iyer, Naveen
    Wang, Hongning
    ACM/SIGIR PROCEEDINGS 2018, 2018, : 495 - 504
  • [8] Weighted Linear Bandits for Non-Stationary Environments
    Russac, Yoan
    Vernade, Claire
    Cappe, Olivier
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [9] Competing Bandits in Non-Stationary Matching Markets
    Ghosh, Avishek
    Sankararaman, Abishek
    Ramchandran, Kannan
    Javidi, Tara
    Mazumdar, Arya
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2024, 70 (04) : 2831 - 2850
  • [10] Discrete time representation of stationary and non-stationary continuous time systems
    Chambers, MJ
    JOURNAL OF ECONOMIC DYNAMICS & CONTROL, 1999, 23 (04): : 619 - 639