TREE-BASED REINFORCEMENT LEARNING FOR ESTIMATING OPTIMAL DYNAMIC TREATMENT REGIMES

Cited by: 39
Authors
Tao, Yebin [1]
Wang, Lu [1]
Almirall, Daniel [2]
Affiliations
[1] Univ Michigan, Dept Biostat, Ann Arbor, MI 48109 USA
[2] Univ Michigan, Inst Social Res, Ann Arbor, MI 48109 USA
Funding
National Institutes of Health (USA);
Keywords
Multi-stage decision-making; personalized medicine; classification; backward induction; decision tree; TREATMENT STRATEGIES; TREATMENT RULES; OUTCOMES; ILLNESS; MODELS;
DOI
10.1214/18-AOAS1137
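Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification code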
020208 ; 070103 ; 0714 ;
Abstract
Dynamic treatment regimes (DTRs) are sequences of treatment decision rules, in which treatment may be adapted over time in response to the changing course of an individual. Motivated by the substance use disorder (SUD) study, we propose a tree-based reinforcement learning (T-RL) method to directly estimate optimal DTRs in a multi-stage multi-treatment setting. At each stage, T-RL builds an unsupervised decision tree that directly handles the problem of optimization with multiple treatment comparisons, through a purity measure constructed with augmented inverse probability weighted estimators. For the multiple stages, the algorithm is implemented recursively using backward induction. By combining semiparametric regression with flexible tree-based learning, T-RL is robust, efficient and easy to interpret for the identification of optimal DTRs, as shown in the simulation studies. With the proposed method, we identify dynamic SUD treatment regimes for adolescents.
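The single-stage building block described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a randomized trial with known, equal treatment probabilities, a deliberately crude outcome model (per-arm sample means that ignore covariates), one covariate, and a single split. The function names `aipw_scores` and `best_split`, the parameter `min_node_size`, and the simulated data are hypothetical and chosen only for illustration. The purity of a candidate split is taken here as the average augmented inverse probability weighted (AIPW) score when each child node receives the treatment that looks best within it. The backward induction over multiple stages is not shown.

```python
import numpy as np


def aipw_scores(y, a, propensity, mu_hat):
    """Return an (n, K) matrix of AIPW pseudo-outcomes, one column per treatment.

    Column k estimates each subject's outcome had they received treatment k:
    mu_hat[:, k] + 1{a == k} / propensity[:, k] * (y - mu_hat[:, k]).
    """
    k = mu_hat.shape[1]
    indicator = (a[:, None] == np.arange(k)[None, :]).astype(float)
    return mu_hat + indicator / propensity * (y[:, None] - mu_hat)


def best_split(x, scores, min_node_size=20):
    """Search one covariate for the threshold that maximizes the purity measure.

    Purity = (best treatment total on the left + best total on the right) / n.
    """
    n = len(x)
    best = None
    for c in np.unique(x)[:-1]:
        left = x <= c
        n_left = int(left.sum())
        if n_left < min_node_size or n - n_left < min_node_size:
            continue
        left_total = scores[left].sum(axis=0)
        right_total = scores[~left].sum(axis=0)
        purity = (left_total.max() + right_total.max()) / n
        if best is None or purity > best["purity"]:
            best = {
                "threshold": float(c),
                "purity": float(purity),
                "left_treatment": int(left_total.argmax()),
                "right_treatment": int(right_total.argmax()),
            }
    return best


# Simulated example (hypothetical data): treatment 0 helps when x <= 0,
# treatment 1 helps when x > 0, and treatment 2 never helps.
rng = np.random.default_rng(0)
n, k = 2000, 3
x = rng.uniform(-1.0, 1.0, size=n)
a = rng.integers(0, k, size=n)  # randomized assignment with probability 1/K each
y = 1.0 + 2.0 * ((a == 0) & (x <= 0)) + 2.0 * ((a == 1) & (x > 0)) + rng.normal(size=n)

propensity = np.full((n, k), 1.0 / k)  # known by design in this sketch
mu_hat = np.tile([y[a == j].mean() for j in range(k)], (n, 1))  # crude outcome model

scores = aipw_scores(y, a, propensity, mu_hat)
split = best_split(x, scores)
print(split)
```

Because the propensities are known here, the AIPW scores remain valid even though the outcome model is misspecified, which is the robustness property the abstract attributes to combining semiparametric regression with tree-based learning. A fuller version would grow the tree recursively and repeat the procedure from the last stage backward.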
Pages: 1914-1938
Number of pages: 25