Accommodating misclassification effects on optimizing dynamic treatment regimes with Q-learning

Cited by: 1

Authors:

Charvadeh, Yasin Khadem ^{[1]}

Yi, Grace Y. ^{[1,2,3]}

Affiliations:

[1] Univ Western Ontario, Dept Stat & Actuarial Sci, London, ON, Canada

[2] Univ Western Ontario, Dept Comp Sci, London, ON, Canada

[3] Univ Western Ontario, Dept Stat & Actuarial Sci, Dept Comp Sci, 1151 Richmond St, London, ON N6A 5B7, Canada

Source:

STATISTICS IN MEDICINE | 2024 / Vol. 43 / Issue 03

Funding:

Natural Sciences and Engineering Research Council of Canada;

Keywords:

dynamic treatment regimes; estimating function; misclassification; Q-learning; regression calibration; regression models; SEQUENCED TREATMENT ALTERNATIVES; PROPORTIONAL HAZARDS MODEL; INFERENCE; REGRESSION; RATIONALE; DESIGN;

DOI:

10.1002/sim.9973

Chinese Library Classification:

Q [Biological Sciences];

Discipline classification codes:

07 ; 0710 ; 09 ;

Abstract:

Research on dynamic treatment regimes has attracted extensive interest. Many methods have been proposed in the literature, which, however, are vulnerable to the presence of misclassification in covariates. In particular, although Q-learning has received considerable attention, its applicability to data with misclassified covariates is unclear. In this article, we investigate how ignoring misclassification in binary covariates can impact the determination of optimal decision rules in randomized treatment settings, and demonstrate its deleterious effects on Q-learning through empirical studies. We present two correction methods to address misclassification effects on Q-learning. Numerical studies reveal that misclassification in covariates induces non-negligible estimation bias and that the correction methods successfully ameliorate bias in parameter estimation.
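
As a rough illustration of the bias described in the abstract, the sketch below simulates a single-stage randomized setting with one binary covariate, fits a linear Q-function by least squares using the misclassified covariate, and then refits after a generic regression-calibration step. Everything in it is an assumption made for illustration: the data-generating model, the parameter values, the known sensitivity and specificity, and the helper names `simulate`, `fit_q`, and `calibrate`. Regression calibration appears among the keywords, but this sketch is not claimed to be either of the two correction methods the article proposes.

```python
import numpy as np

# Hypothetical true parameters: Y = B0 + B1*X + A*(PSI0 + PSI1*X) + noise
B0, B1, PSI0, PSI1 = 1.0, 0.5, -1.0, 2.0
P, SE, SP = 0.5, 0.8, 0.8  # assumed prevalence, sensitivity, specificity


def simulate(n, rng):
    """Draw true covariate X, randomized treatment A, outcome Y, and a
    nondifferentially misclassified surrogate X* of X."""
    x = rng.binomial(1, P, size=n)
    a = rng.binomial(1, 0.5, size=n)  # randomized treatment
    y = B0 + B1 * x + a * (PSI0 + PSI1 * x) + rng.normal(size=n)
    # P(X*=1 | X=1) = SE and P(X*=1 | X=0) = 1 - SP
    x_star = rng.binomial(1, np.where(x == 1, SE, 1.0 - SP))
    return x_star, a, y


def fit_q(cov, a, y):
    """Least-squares fit of the Q-function with terms (1, cov, A, A*cov)."""
    design = np.column_stack([np.ones_like(y), cov, a, a * cov])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def calibrate(x_star):
    """Replace X* by E[X | X*], computed from the assumed P, SE, SP."""
    e1 = SE * P / (SE * P + (1.0 - SP) * (1.0 - P))
    e0 = (1.0 - SE) * P / ((1.0 - SE) * P + SP * (1.0 - P))
    return np.where(x_star == 1, e1, e0)


rng = np.random.default_rng(0)
x_star, a, y = simulate(200_000, rng)

naive = fit_q(x_star, a, y)                 # ignores misclassification
corrected = fit_q(calibrate(x_star), a, y)  # regression calibration

print("naive     (psi0, psi1):", naive[2], naive[3])
print("corrected (psi0, psi1):", corrected[2], corrected[3])
```

Under these assumed rates, E[X | X*] takes the values 0.8 and 0.2, so the naive interaction coefficient is attenuated toward 0.6 times its true value, while the calibrated fit targets the true coefficient because the assumed Q-function is linear in X and treatment is randomized.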

Pages: 578 - 605

Page count: 28

50 related records in total

[41] High-speed railway dynamic scheduling based on Q-learning method
Han X.-C.
Yu S.-P.
Yuan Z.-M.
Cheng L.-J.
Kongzhi Lilun Yu Yingyong/Control Theory and Applications, 2021, 38 (10) : 1511 - 1521
[42] Optimizing wireless sensor network routing with Q-learning: enhancing energy efficiency and network longevity
Jain, Amit Kumar
Jain, Sushil
Mathur, Garima
ENGINEERING RESEARCH EXPRESS, 2024, 6 (04)
[43] Adaptation of Q-learning in dynamic shortest paths problem based on elecronic maps
Zou Liang
Xu Jian-min
PROCEEDINGS OF 2005 CHINESE CONTROL AND DECISION CONFERENCE, VOLS 1 AND 2, 2005 : 1795 - 1798
[44] Dynamic parallel machine scheduling with mean weighted tardiness objective by Q-Learning
Zhicong Zhang
Li Zheng
Michael X. Weng
The International Journal of Advanced Manufacturing Technology, 2007, 34 : 968 - 980
[45] A Dynamic Building Method of Mobile Agent Migrating Path Based on Q-Learning
Cheng, Yuan-cai
Wang, Xiao-lin
2014 IEEE 7TH JOINT INTERNATIONAL INFORMATION TECHNOLOGY AND ARTIFICIAL INTELLIGENCE CONFERENCE (ITAIC), 2014 : 270 - 275
[46] Dynamic Attack Detection in IoT Networks: An Ensemble Learning Approach With Q-Learning and Explainable AI
Turaka, Padmasri
Panigrahy, Saroj Kumar
IEEE ACCESS, 2024, 12 : 161925 - 161940
[47] Optimizing the Post-disaster Resource Allocation with Q-Learning: Demonstration of 2021 China Flood
Dong, Linhao
Bai, Yanbing
Xu, Qingsong
Mas, Erick
DATABASE AND EXPERT SYSTEMS APPLICATIONS, DEXA 2022, PT II, 2022, 13427 : 256 - 262
[48] A Q-Learning Approach for Optimizing the Impact of Musical Education Using Virtual Reality and Social Robots
Fengmei, He
MOBILE NETWORKS & APPLICATIONS, 2024
[49] Resilient Cloud Control System: Dynamic Frequency Adaptation via Q-learning
Akbarian, Fatemeh
Tarneberg, William
Fitzgerald, Emma
Kihl, Maria
PROCEEDINGS OF THE 27TH CONFERENCE ON INNOVATION IN CLOUDS, INTERNET AND NETWORKS, ICIN, 2024 : 242 - 249
[50] Dynamic parallel machine scheduling with mean weighted tardiness objective by Q-Learning
Zhang, Zhicong
Zheng, Li
Weng, Michael X.
INTERNATIONAL JOURNAL OF ADVANCED MANUFACTURING TECHNOLOGY, 2007, 34 (9-10) : 968 - 980