TREE-BASED REINFORCEMENT LEARNING FOR ESTIMATING OPTIMAL DYNAMIC TREATMENT REGIMES

Cited by: 34
Authors
Tao, Yebin [1 ]
Wang, Lu [1 ]
Almirall, Daniel [2 ]
Affiliations
[1] Univ Michigan, Dept Biostat, Ann Arbor, MI 48109 USA
[2] Univ Michigan, Inst Social Res, Ann Arbor, MI 48109 USA
Source
ANNALS OF APPLIED STATISTICS, 2018, Vol. 12, No. 3
Funding
US National Institutes of Health (NIH)
Keywords
Multi-stage decision-making; personalized medicine; classification; backward induction; decision tree; TREATMENT STRATEGIES; TREATMENT RULES; OUTCOMES; ILLNESS; MODELS
DOI
10.1214/18-AOAS1137
CLC classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline codes
020208; 070103; 0714
Abstract
Dynamic treatment regimes (DTRs) are sequences of treatment decision rules in which treatment may be adapted over time in response to the changing course of an individual. Motivated by a substance use disorder (SUD) study, we propose a tree-based reinforcement learning (T-RL) method to directly estimate optimal DTRs in a multi-stage, multi-treatment setting. At each stage, T-RL builds an unsupervised decision tree that directly handles the optimization problem with multiple treatment comparisons, through a purity measure constructed with augmented inverse probability weighted (AIPW) estimators. Across the multiple stages, the algorithm is implemented recursively using backward induction. By combining semiparametric regression with flexible tree-based learning, T-RL is robust, efficient, and easy to interpret for the identification of optimal DTRs, as shown in simulation studies. With the proposed method, we identify dynamic SUD treatment regimes for adolescents.
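The abstract describes a stagewise procedure: at each stage, subject-level counterfactual outcomes under each candidate treatment are estimated with augmented inverse probability weighting (AIPW), a decision tree is grown to maximize a purity measure built from those estimates, and the stages are linked by backward induction. The sketch below is an illustrative approximation only, not the authors' implementation: it covers a single stage, uses scikit-learn models, and substitutes a weighted classification tree fit to each subject's AIPW-best treatment in place of the paper's purity-based tree-growing. The function names (aipw_matrix, fit_stage_rule) and modeling choices (logistic propensity model, linear outcome model, shallow CART tree) are assumptions made for illustration.

import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.tree import DecisionTreeClassifier


def aipw_matrix(X, A, Y, treatments):
    """AIPW estimate of each subject's counterfactual mean outcome under
    every candidate treatment, using the standard form
    I(A=a)*Y/pi_a(X) - (I(A=a) - pi_a(X))/pi_a(X) * mu_a(X)."""
    n = len(Y)
    prop = LogisticRegression(max_iter=1000).fit(X, A)      # propensity model
    pi = prop.predict_proba(X)                               # n x K estimated propensities
    mu = np.zeros((n, len(treatments)))
    for k, a in enumerate(treatments):
        col = list(prop.classes_).index(a)                   # align probability column with arm a
        reg = LinearRegression().fit(X[A == a], Y[A == a])   # outcome model for arm a
        m_hat = reg.predict(X)
        ind = (A == a).astype(float)
        mu[:, k] = ind * Y / pi[:, col] - (ind - pi[:, col]) / pi[:, col] * m_hat
    return mu


def fit_stage_rule(X, A, Y, treatments, max_depth=3):
    """Approximate one stage of a tree-based treatment rule: fit a shallow
    classification tree to each subject's AIPW-best treatment, weighted by
    the estimated loss from assigning that subject incorrectly."""
    mu = aipw_matrix(X, A, Y, treatments)
    best = np.asarray(treatments)[mu.argmax(axis=1)]         # subject-level best arm
    gain = mu.max(axis=1) - mu.min(axis=1)                   # misclassification-cost proxy
    tree = DecisionTreeClassifier(max_depth=max_depth)
    tree.fit(X, best, sample_weight=gain)
    return tree                                              # tree.predict(X_new) gives the stage rule

In a multi-stage analysis, the backward induction described in the abstract would apply such a step at the final stage first, then, at each earlier stage, replace Y with a pseudo-outcome reflecting optimal treatment at all later stages.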
Pages: 1914-1938 (25 pages)
Related Papers (50 records in total)
  • [1] Song, Yao; Wang, Lu. Multiobjective tree-based reinforcement learning for estimating tolerant dynamic treatment regimes. BIOMETRICS, 2024, 80(01).
  • [2] Zhou, Nina; Wang, Lu; Almirall, Daniel. Estimating tree-based dynamic treatment regimes using observational data with restricted treatment sequences. BIOMETRICS, 2023, 79(03): 2260-2271.
  • [3] Sies, Aniek; Van Mechelen, Iven. Comparing four methods for estimating tree-based treatment regimes. INTERNATIONAL JOURNAL OF BIOSTATISTICS, 2017, 13(01).
  • [4] Sun, Yilun; Wang, Lu. Stochastic tree search for estimating optimal dynamic treatment regimes. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2020, 116(533): 421-432.
  • [5] Castelletti, A.; Galelli, S.; Restelli, M.; Soncini-Sessa, R. Tree-based reinforcement learning for optimal water reservoir operation. WATER RESOURCES RESEARCH, 2010, 46.
  • [6] Zhang, Junzhe; Bareinboim, Elias. Near-optimal reinforcement learning in dynamic treatment regimes. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32.
  • [7] Speth, Kelly A.; Elliott, Michael R.; Marquez, Juan L.; Wang, Lu. Penalized Spline-Involved Tree-based (PenSIT) Learning for estimating an optimal dynamic treatment regime using observational data. STATISTICAL METHODS IN MEDICAL RESEARCH, 2022, 31(12): 2338-2351.
  • [8] Tang, Ming; Wang, Lu; Gorin, Michael A.; Taylor, Jeremy M. G. Step-adjusted tree-based reinforcement learning for evaluating nested dynamic treatment regimes using test-and-treat observational data. STATISTICS IN MEDICINE, 2021, 40(27): 6164-6177.
  • [9] Yoon, Alfred P.; Song, Yao; Lin, I-Chun F.; Wang, Lu; Chung, Kevin C. Tree-based reinforcement learning for identifying optimal personalized treatment decisions for hand deformity in rheumatoid arthritis. PLASTIC AND RECONSTRUCTIVE SURGERY, 2024, 154(06): 1259-1266.
  • [10] Laber, E. B.; Zhao, Y. Q. Tree-based methods for individualized treatment regimes. BIOMETRIKA, 2015, 102(03): 501-514.