A smoothed Q-learning algorithm for estimating optimal dynamic treatment regimes
Cited by: 2
Authors:
Fan, Yanqin [1]
He, Ming [2]
Su, Liangjun [3]
Zhou, Xiao-Hua [4,5]
Affiliations:
[1] Univ Washington, Dept Econ, Seattle, WA 98195 USA
[2] Univ Technol Sydney, Econ Discipline Grp, Ultimo, Australia
[3] Singapore Management Univ, Sch Econ, Singapore, Singapore
[4] Peking Univ, Beijing Int Ctr Math Res, Beijing 100871, Peoples R China
[5] Peking Univ, Sch Publ Hlth, Beijing 100191, Peoples R China
Funding:
National Natural Science Foundation of China;
Keywords:
asymptotic normality;
exceptional law;
optimal smoothing parameter;
sequential randomization;
Wald-type inference;
TECHNICAL CHALLENGES;
INFERENCE;
DOI:
10.1111/sjos.12359
Chinese Library Classification (CLC) number:
O21 [Probability Theory and Mathematical Statistics];
C8 [Statistics];
Discipline classification codes:
020208 ;
070103 ;
0714 ;
Abstract:
In this paper, we propose a smoothed Q-learning algorithm for estimating optimal dynamic treatment regimes. In contrast to the Q-learning algorithm in which nonregular inference is involved, we show that, under assumptions adopted in this paper, the proposed smoothed Q-learning estimator is asymptotically normally distributed even when the Q-learning estimator is not and its asymptotic variance can be consistently estimated. As a result, inference based on the smoothed Q-learning estimator is standard. We derive the optimal smoothing parameter and propose a data-driven method for estimating it. The finite sample properties of the smoothed Q-learning estimator are studied and compared with several existing estimators including the Q-learning estimator via an extensive simulation study. We illustrate the new method by analyzing data from the Clinical Antipsychotic Trials of Intervention Effectiveness-Alzheimer's Disease (CATIE-AD) study.
Pages: 446-469
Number of pages: 24