Deep learning from phylogenies to uncover the epidemiological dynamics of outbreaks

Cited by: 39
Authors
Voznica, J. [1, 2, 3]
Zhukova, A. [1, 4, 5, 6]
Boskova, V. [7, 8]
Saulnier, E. [1]
Lemoine, F. [1, 4]
Moslonka-Lefebvre, M. [1]
Gascuel, O. [1, 9]
Affiliations
[1] Univ Paris Cite, Inst Pasteur, Unite Bioinformat Evolut, Paris, France
[2] Univ Paris, Paris, France
[3] Univ Paris Sci & Lettres, CNRS, INSERM, Inst Biol, Ecole Normale Super, Ecole Normale Super, Paris, France
[4] Univ Paris Cite, Inst Pasteur, Bioinformat & Biostat Hub, Paris, France
[5] Univ Paris Cite, Inst Pasteur, Epidemiol & Modelling Antibiot Evas, Paris, France
[6] Univ Paris Saclay, UVSQ, INSERM, CESP, Villejuif, France
[7] Univ Vienna, Max Perutz Labs, Ctr Integrat Bioinformat Vienna, Vienna, Austria
[8] Med Univ Vienna, Vienna, Austria
[9] CNRS, Museum Natl Hist Nat, Inst Systemat, Evolut, Biodivers, UMR 7205, SU, EPHE, UA, Paris, France
Funding
Swiss National Science Foundation;
Keywords
HIV; TRANSMISSION; INFERENCE; MODEL;
DOI
10.1038/s41467-022-31511-0
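Chinese Library Classification codes
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [Natural sciences, general];
Discipline classification codes
07; 0710; 09;
Abstract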
Widely applicable, accurate and fast inference methods in phylodynamics are needed to fully profit from the richness of genetic data in uncovering the dynamics of epidemics. Standard methods, including maximum-likelihood and Bayesian approaches, generally rely on complex mathematical formulae and approximations, and do not scale with dataset size. We develop a likelihood-free, simulation-based approach, which combines deep learning with (1) a large set of summary statistics measured on phylogenies or (2) a complete and compact representation of trees, which avoids potential limitations of summary statistics and applies to any phylodynamics model. Our method enables both model selection and estimation of epidemiological parameters from very large phylogenies. We demonstrate its speed and accuracy on simulated data, where it performs better than the state-of-the-art methods. To illustrate its applicability, we assess the dynamics induced by superspreading individuals in an HIV dataset of men-having-sex-with-men in Zurich. Our tool PhyloDeep is available on github.com/evolbioinfo/phylodeep.
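
The likelihood-free, simulation-based idea in the abstract can be illustrated with a toy sketch. This is not PhyloDeep and not the authors' method: it assumes a pure-birth (Yule) model instead of the paper's phylodynamic models, uses a single summary statistic (total branch length), and substitutes a least-squares line on the log scale for the neural network. All function names here are hypothetical.

```python
import math
import random


def simulate_yule_tree_length(birth_rate, n_tips, rng):
    """Total branch length of a pure-birth tree grown from 1 to n_tips lineages."""
    total = 0.0
    for k in range(1, n_tips):
        wait = rng.expovariate(k * birth_rate)  # time until the next branching event
        total += k * wait                       # each of the k lineages grows by `wait`
    return total


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def train_estimator(n_sims=2000, n_tips=200, seed=0):
    """Learn a map from the summary statistic to the parameter using simulations only."""
    rng = random.Random(seed)
    features, targets = [], []
    for _ in range(n_sims):
        # Assumed prior: birth rate log-uniform on [0.5, 3.0].
        rate = math.exp(rng.uniform(math.log(0.5), math.log(3.0)))
        length = simulate_yule_tree_length(rate, n_tips, rng)
        features.append(math.log(length))
        targets.append(math.log(rate))
    slope, intercept = fit_line(features, targets)
    # The fitted line stands in for the trained network; no likelihood is evaluated.
    return lambda tree_length: math.exp(intercept + slope * math.log(tree_length))


estimate = train_estimator()
observed = simulate_yule_tree_length(1.5, 200, random.Random(42))
print(round(estimate(observed), 2))  # estimate of the birth rate used above (1.5)
```

The estimator is trained once on simulated data and can then be applied to any tree of the same size, which is the property that lets simulation-based approaches amortize their cost across datasets.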
Pages: 14