Deep Spatial Q-Learning for Infectious Disease Control

Times cited: 1
Authors
Liu, Zhishuai [1 ]
Clifton, Jesse [2 ]
Laber, Eric B. [3 ]
Drake, John [4 ]
Fang, Ethan X. [5 ]
Affiliations
[1] Duke Univ, Dept Stat Sci, Durham, NC USA
[2] NC State Univ, Dept Stat, Raleigh, NC USA
[3] Duke Univ, Dept Stat Sci, Dept Biostat & Bioinformat, Durham, NC 27708 USA
[4] Univ Georgia, Sch Ecol, Athens, GA USA
[5] Duke Univ, Dept Biostat & Bioinformat, Durham, NC USA
Funding
National Science Foundation (USA);
Keywords
Infectious diseases; Reinforcement learning; Graph neural networks; Dynamic treatment regimes; Ebola virus disease; Causal inference; Regression; Models; Time; Epidemic
DOI
10.1007/s13253-023-00551-4
Chinese Library Classification
Q [Biological sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
Infectious diseases are a cause of humanitarian and economic crises across the world. In developing regions, a severe epidemic can result in the collapse of healthcare infrastructure or even the failure of an affected state. The most recent 2013-2015 outbreak of Ebola virus disease in West Africa is an example of such an epidemic. The economic, infrastructural, and human costs of this outbreak provide strong motivation for the examination of adaptive treatment strategies that allocate resources in response to and anticipation of the evolution of an epidemic. We formalize adaptive management of an emerging infectious disease spreading across a set of locations as a treatment regime that maps up-to-date information on the epidemic to a subset of locations identified as high-priority for treatment. An optimal treatment regime in this context is defined as maximizing the expectation of a pre-specified cumulative utility measure, e.g., the number of disease-free individuals or the estimated reduction in morbidity or mortality relative to a baseline intervention strategy. Because the disease dynamics are not known at the beginning of an outbreak, an optimal treatment regime must be estimated online, i.e., as data accumulate; thus, an effective estimation algorithm must balance choosing interventions that lead to information gain and thereby model improvement with interventions that appear to be optimal under the current estimated model. We develop a novel model-free algorithm for the online management of an infectious disease spreading over a finite set of locations and an indefinite or infinite time horizon. The proposed algorithm balances exploration and exploitation using a semi-parametric variant of Thompson sampling. We also introduce a graph neural network-based estimator in order to improve the performance of this class of algorithms. 
Simulations, including those mimicking the spread of the 2013-2015 Ebola outbreak, suggest that an adaptive treatment strategy has the potential to significantly reduce mortality relative to ad hoc management strategies. Supplementary materials accompanying this paper appear online.
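The abstract describes balancing exploration and exploitation with a semi-parametric variant of Thompson sampling. As a rough illustration of the underlying idea only, the sketch below uses the textbook Beta-Bernoulli form of Thompson sampling to choose which locations to treat under a fixed budget. It is not the authors' algorithm: the function names, the per-location "treatment benefit" probabilities, and the budget are all hypothetical assumptions made for this example.

```python
import numpy as np

def select_locations(alpha, beta, budget, rng):
    """Choose `budget` locations to treat from one posterior draw (hypothetical helper)."""
    # One Beta draw per location; the randomness of the draw drives exploration.
    sampled = rng.beta(alpha, beta)
    # Treat the locations with the largest sampled benefit.
    return np.argsort(-sampled)[:budget]

def run(true_benefit, budget, n_steps, seed=0):
    """Simulate online treatment allocation with assumed Bernoulli outcomes."""
    rng = np.random.default_rng(seed)
    n = len(true_benefit)
    alpha = np.ones(n)  # Beta(1, 1) prior on each location's benefit probability
    beta = np.ones(n)
    for _ in range(n_steps):
        treated = select_locations(alpha, beta, budget, rng)
        # Assumed outcome model: treatment succeeds with the true (unknown) probability.
        outcome = rng.random(budget) < true_benefit[treated]
        alpha[treated] += outcome    # count successes
        beta[treated] += ~outcome    # count failures
    return alpha, beta

alpha, beta = run(np.array([0.1, 0.9, 0.5, 0.2]), budget=2, n_steps=50)
posterior_mean = alpha / (alpha + beta)  # estimated benefit per location
```

In this toy setting the posterior concentrates on the locations that are treated most often, which tend to be those with the highest benefit. The paper's setting is harder because the disease spreads between locations, so outcomes are not independent across sites.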
Pages: 749-773
Number of pages: 25