MULTI-ARMED BANDITS WITH COVARIATES: THEORY AND APPLICATIONS

Cited by: 2
Authors
Kim, Dong Woo [1 ]
Lai, Tze Leung [2 ]
Xu, Huanzhong [3 ]
Affiliations
[1] Microsoft Corp, Anal & Expt Team, Redmond, WA 98052 USA
[2] Stanford Univ, Dept Stat, Stanford, CA 94305 USA
[3] Stanford Univ, Inst Computat & Math Engn, Stanford, CA 94305 USA
Funding
National Science Foundation (USA)
Keywords
Contextual multi-armed bandits; ε-greedy randomization; personalized medicine; recommender system; reinforcement learning; INFORMATION; ALLOCATION; REGRESSION; CONVERGENCE; RATES
DOI
10.5705/ss.202020.0454
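Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract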
"Multi-armed bandits" were introduced as a new direction in the then-nascent field of sequential analysis, developed during World War II in response to the need for more efficient testing of anti-aircraft gunnery, and later as a concrete application of dynamic programming and optimal control of Markov decision processes. A comprehensive theory that unified both directions emerged in the 1980s, providing important insights and algorithms for diverse applications in many science, technology, engineering, and mathematics fields. The turn of the millennium marked the onset of a "personalization revolution," spanning personalized medicine, online personalized advertising, and recommender systems (e.g., Netflix's recommendations for movies and TV shows, Amazon's recommendations for products to purchase, and Microsoft's Matchbox recommender). This has required an extension of classical bandit theory to nonparametric contextual bandits, where "contextual" refers to the incorporation of personal information as covariates. Such theory is developed herein, together with illustrative applications, statistical models, and computational tools for its implementation.
Pages: 2275-2287
Number of pages: 13
Related papers
50 in total
  • [21] Multi-Armed Bandits with Fairness Constraints for Distributing Resources to Human Teammates
    Claure, Houston
    Chen, Yifang
    Modi, Jignesh
    Jung, Malte
    Nikolaidis, Stefanos
    PROCEEDINGS OF THE 2020 ACM/IEEE INTERNATIONAL CONFERENCE ON HUMAN-ROBOT INTERACTION (HRI '20), 2020, : 299 - 308
  • [22] Analysis of Thompson Sampling for Partially Observable Contextual Multi-Armed Bandits
    Park, Hongju
    Faradonbeh, Mohamad Kazem Shirani
    IEEE CONTROL SYSTEMS LETTERS, 2022, 6 : 2150 - 2155
  • [23] A non-parametric solution to the multi-armed bandit problem with covariates
    Ai, Mingyao
    Huang, Yimin
    Yu, Jun
    JOURNAL OF STATISTICAL PLANNING AND INFERENCE, 2021, 211 : 402 - 413
  • [24] Personalized clinical trial based on multi-armed bandit algorithms with covariates
    Shao, Yifei
    PROCEEDINGS OF INTERNATIONAL CONFERENCE ON ALGORITHMS, SOFTWARE ENGINEERING, AND NETWORK SECURITY, ASENS 2024, 2024, : 12 - 17
  • [25] Multi-Armed Bandits for Spectrum Allocation in Multi-Agent Channel Bonding WLANs
    Barrachina-Munoz, Sergio
    Chiumento, Alessandro
    Bellalta, Boris
    IEEE ACCESS, 2021, 9 : 133472 - 133490
  • [26] Multi-Player Multi-Armed Bandits With Collision-Dependent Reward Distributions
    Shi, Chengshuai
    Shen, Cong
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2021, 69 : 4385 - 4402
  • [27] IEEE 802.11bn Multi-AP Coordinated Spatial Reuse With Hierarchical Multi-Armed Bandits
    Wojnar, Maksymilian
    Ciezobka, Wojciech
    Kosek-Szott, Katarzyna
    Rusek, Krzysztof
    Szott, Szymon
    Nunez, David
    Bellalta, Boris
    IEEE COMMUNICATIONS LETTERS, 2025, 29 (03) : 428 - 432
  • [28] Decentralized AP selection using Multi-Armed Bandits: Opportunistic ε-Greedy with Stickiness
    Carrascosa, Marc
    Bellalta, Boris
    2019 IEEE SYMPOSIUM ON COMPUTERS AND COMMUNICATIONS (ISCC), 2019, : 309 - 315
  • [29] Multi-Armed Bandits for Minesweeper: Profiting From Exploration-Exploitation Synergy
    Lordeiro, Igor Q.
    Haddad, Diego B.
    Cardoso, Douglas O.
    IEEE TRANSACTIONS ON GAMES, 2022, 14 (03) : 403 - 412
  • [30] Multi-Armed Bandits with Endogenous Learning Curves: An Application to Split Liver Transplantation
    Tang, Yanhan
    Li, Andrew
    Scheller-Wolf, Alan
    Tayur, Sridhar
    M&SOM-MANUFACTURING & SERVICE OPERATIONS MANAGEMENT, 2025, : 640 - 658