Deep Spatial Q-Learning for Infectious Disease Control

Cited by: 0

Authors:

Zhishuai Liu

Jesse Clifton

Eric B. Laber

John Drake

Ethan X. Fang

Affiliations:

[1] Duke University, Department of Statistical Science

[2] NC State University, Department of Statistics

[3] Duke University, Department of Statistical Science, Department of Biostatistics and Bioinformatics

[4] University of Georgia, School of Ecology

[5] Duke University, Department of Biostatistics and Bioinformatics

Source:

Journal of Agricultural, Biological and Environmental Statistics | 2023 / Vol. 28

Keywords:

Infectious diseases; Reinforcement learning; Graph neural networks

DOI:

Not available

Abstract:

Infectious diseases are a cause of humanitarian and economic crises across the world. In developing regions, a severe epidemic can result in the collapse of healthcare infrastructure or even the failure of an affected state. The most recent 2013–2015 outbreak of Ebola virus disease in West Africa is an example of such an epidemic. The economic, infrastructural, and human costs of this outbreak provide strong motivation for the examination of adaptive treatment strategies that allocate resources in response to and anticipation of the evolution of an epidemic. We formalize adaptive management of an emerging infectious disease spreading across a set of locations as a treatment regime that maps up-to-date information on the epidemic to a subset of locations identified as high-priority for treatment. An optimal treatment regime in this context is defined as maximizing the expectation of a pre-specified cumulative utility measure, e.g., the number of disease-free individuals or the estimated reduction in morbidity or mortality relative to a baseline intervention strategy. Because the disease dynamics are not known at the beginning of an outbreak, an optimal treatment regime must be estimated online, i.e., as data accumulate; thus, an effective estimation algorithm must balance choosing interventions that lead to information gain and thereby model improvement with interventions that appear to be optimal under the current estimated model. We develop a novel model-free algorithm for the online management of an infectious disease spreading over a finite set of locations and an indefinite or infinite time horizon. The proposed algorithm balances exploration and exploitation using a semi-parametric variant of Thompson sampling. We also introduce a graph neural network-based estimator in order to improve the performance of this class of algorithms. Simulations, including those mimicking the spread of the 2013–2015 Ebola outbreak, suggest that an adaptive treatment strategy has the potential to significantly reduce mortality relative to ad hoc management strategies.

Pages: 749-773

Page count: 24
