Improving Deep Reinforcement Learning With Transitional Variational Autoencoders: A Healthcare Application

Cited by: 21
Authors
Baucum, Matthew [1]
Khojandi, Anahita [1]
Vasudevan, Rama [2]
Affiliations
[1] Univ Tennessee, Dept Ind & Syst Engn, Knoxville, TN 37996 USA
[2] Oak Ridge Natl Lab, Ctr Nanophase Mat Sci, Oak Ridge, TN 37830 USA
Keywords
Hidden Markov models; Data models; Neural networks; Training; Trajectory; Biomedical measurement; Reinforcement learning; variational autoencoders; generative adversarial networks; long short-term memory networks
DOI
10.1109/JBHI.2020.3027443
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Reinforcement learning is a powerful tool for developing personalized treatment regimens from healthcare data. Yet training reinforcement learning agents through direct interactions with patients is often impractical for ethical reasons. One solution is to train reinforcement learning agents using an 'environment model,' which is learned from retrospective patient data and can simulate realistic patient trajectories. In this study, we propose transitional variational autoencoders (tVAE), a generative neural network architecture that learns a direct mapping between distributions over clinical measurements at adjacent time points. Unlike other models, the tVAE requires few distributional assumptions and benefits from identical training and testing architectures. This model produces more realistic patient trajectories than state-of-the-art sequential decision-making models and generative neural networks, and can be used to learn effective treatment policies.
Pages: 2273-2280
Page count: 8