Robust Q-Learning

Cited by: 18

Authors:

Ertefaie, Ashkan ^{[1]}

McKay, James R. ^{[2]}

Oslin, David ^{[3,4,5]}

Strawderman, Robert L. ^{[1]}

Affiliations:

[1] Univ Rochester, Dept Biostat & Computat Biol, 265 Crittenden Blvd,CU 420630, Rochester, NY 14642 USA

[2] Univ Penn, Dept Psychiat, Ctr Continuum Care Addict, Philadelphia, PA 19104 USA

[3] Univ Penn, Philadelphia Vet Adm Med Ctr, Philadelphia, PA 19104 USA

[4] Univ Penn, Treatment Res Ctr, Philadelphia, PA 19104 USA

[5] Univ Penn, Ctr Studies Addict, Dept Psychiat, Philadelphia, PA 19104 USA

Source:

JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION | 2021 / Vol. 116 / Issue 533

Keywords:

Cross-fitting; Data-adaptive techniques; Dynamic treatment strategies; Residual confounding; DYNAMIC TREATMENT REGIMES; DESIGN; INFERENCE; STRATEGIES; SELECTION;

DOI:

10.1080/01621459.2020.1753522

Chinese Library Classification (CLC) number:

O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];

Subject classification codes:

020208 ; 070103 ; 0714 ;

Abstract:

Q-learning is a regression-based approach that is widely used to formalize the development of an optimal dynamic treatment strategy. Finite dimensional working models are typically used to estimate certain nuisance parameters, and misspecification of these working models can result in residual confounding and/or efficiency loss. We propose a robust Q-learning approach which allows estimating such nuisance parameters using data-adaptive techniques. We study the asymptotic behavior of our estimators and provide simulation studies that highlight the need for and usefulness of the proposed method in practice. We use the data from the "Extending Treatment Effectiveness of Naltrexone" multistage randomized trial to illustrate our proposed methods. Supplementary materials for this article are available online.
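
As a rough illustration of the idea in the abstract, the sketch below fits a single-stage treatment-effect model by residualizing both the outcome and the treatment on the covariate, with the two nuisance regressions estimated data-adaptively and cross-fitted. It is a minimal sketch under stated assumptions, not the authors' multistage estimator: the k-nearest-neighbour smoother, the simulated data, and the names `knn_predict`, `cross_fit`, and `robust_stage_fit` are all hypothetical choices made for this example.

```python
import numpy as np


def knn_predict(x_train, y_train, x_test, k=40):
    """k-nearest-neighbour regression in one dimension (stand-in data-adaptive learner)."""
    dist = np.abs(x_test[:, None] - x_train[None, :])
    # Indices of the k closest training points for every test point.
    idx = np.argpartition(dist, k, axis=1)[:, :k]
    return y_train[idx].mean(axis=1)


def cross_fit(x, target, n_folds=2, k=40, seed=0):
    """Predict each observation's target from a model fitted on the other folds."""
    rng = np.random.default_rng(seed)
    folds = rng.permutation(len(x)) % n_folds
    pred = np.empty(len(x))
    for f in range(n_folds):
        test = folds == f
        pred[test] = knn_predict(x[~test], target[~test], x[test], k)
    return pred


def robust_stage_fit(x, a, y):
    """Regress the residualized outcome on the residualized treatment times (1, x)."""
    y_res = y - cross_fit(x, y)  # Y minus estimated E[Y | X]
    a_res = a - cross_fit(x, a)  # A minus estimated E[A | X]
    design = a_res[:, None] * np.column_stack([np.ones_like(x), x])
    psi, *_ = np.linalg.lstsq(design, y_res, rcond=None)
    return psi


# Simulated single-stage data (assumed for illustration only):
# treatment assignment depends on x, and the main effect sin(3x) is nonlinear.
rng = np.random.default_rng(1)
n = 4000
x = rng.uniform(-1.0, 1.0, n)
a = rng.binomial(1, 1.0 / (1.0 + np.exp(-x))).astype(float)
y = np.sin(3.0 * x) + a * (1.0 + 0.5 * x) + rng.normal(0.0, 0.5, n)

psi_hat = robust_stage_fit(x, a, y)  # true treatment-effect coefficients are (1.0, 0.5)
treat = psi_hat[0] + psi_hat[1] * x > 0  # estimated rule: treat when the fitted effect is positive
```

Because the nonlinear main effect is removed by the nuisance fits rather than by a parametric working model, the treatment-effect coefficients are not tied to a correct specification of that main effect, which is the motivation the abstract describes.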


Pages: 368-381

Page count: 14
