Time-Decaying Bandits for Non-stationary Systems

被引:5
|
作者
机构
[1] Komiyama, Junpei
[2] Qin, Tao
来源
Komiyama, Junpei | 1600年 / Springer Verlag卷 / 8877期
关键词
Stochastic systems;
D O I
10.1007/978-3-319-13129-0_40
中图分类号
学科分类号
摘要
Contents displayed on web portals (e.g., news articles at Yahoo.com) are usually adaptively selected from a dynamic set of candidate items, and the attractiveness of each item decays over time. The goal of those websites is to maximize the engagement of users (usually measured by their clicks) on the selected items.We formulate this kind of applications as a new variant of bandit problems where new arms are dynamically added into the candidate set and the expected reward of each arm decays as the round proceeds. For this new problem, a direct application of the algorithms designed for stochastic MAB (e.g., UCB) will lead to over-estimation of the rewards of old arms, and thus cause a misidentification of the optimal arm. To tackle this challenge, we propose a new algorithm that can adaptively estimate the temporal dynamics in the rewards of the arms, and effectively identify the best arm at a given time point on this basis. When the temporal dynamics are represented by a set of features, the proposed algorithm is able to enjoy a sub-linear regret. Our experiments verify the effectiveness of the proposed algorithm. © Springer International Publishing Switzerland 2014.
引用
收藏
相关论文
共 50 条
  • [31] QUANTUM INVERSE SCATTERING FOR TIME-DECAYING HARMONIC OSCILLATORS
    Ishida, Atsuhide
    INVERSE PROBLEMS AND IMAGING, 2025, 19 (02) : 282 - 296
  • [32] Strichartz estimates for harmonic potential with time-decaying coefficient
    Masaki Kawamoto
    Taisuke Yoneyama
    Journal of Evolution Equations, 2018, 18 : 127 - 142
  • [33] Exact smoothing for stationary and non-stationary time series
    Casals, J
    Jerez, M
    Sotoca, S
    INTERNATIONAL JOURNAL OF FORECASTING, 2000, 16 (01) : 59 - 69
  • [34] Classification of non-stationary time series
    Krzemieniewska, Karolina
    Eckley, Idris A.
    Fearnhead, Paul
    STAT, 2014, 3 (01): : 144 - 157
  • [35] Lorentzian non-stationary dynamical systems
    Molaei, Mohammad Reza
    Khajoei, Najmeh
    JOURNAL OF INTERDISCIPLINARY MATHEMATICS, 2022, 25 (08) : 2127 - 2140
  • [36] Achieving Optimal Dynamic Regret for Non-stationary Bandits without Prior Information
    Auer, Peter
    Chen, Yifang
    Gajane, Pratik
    Lee, Chung-Wei
    Luo, Haipeng
    Ortner, Ronald
    Wei, Chen-Yu
    CONFERENCE ON LEARNING THEORY, VOL 99, 2019, 99
  • [37] ANACONDA: An Improved Dynamic Regret Algorithm for Adaptive Non-Stationary Dueling Bandits
    Buening, Thomas Kleine
    Saha, Aadirupa
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 206, 2023, 206
  • [38] Non-Stationary Abstract Friedrichs Systems
    Krešimir Burazin
    Marko Erceg
    Mediterranean Journal of Mathematics, 2016, 13 : 3777 - 3796
  • [39] ON SENSITIVITY IN MULTIVARIATE NON-STATIONARY SYSTEMS
    PORTER, WA
    INTERNATIONAL JOURNAL OF CONTROL, 1968, 7 (05) : 481 - &
  • [40] A Change-Detection-Based Thompson Sampling Framework for Non-Stationary Bandits
    Ghatak, Gourab
    IEEE TRANSACTIONS ON COMPUTERS, 2021, 70 (10) : 1670 - 1676