Predicting the short-term success of human influenza virus variants with machine learning

Cited by: 13
Authors
Hayati, Maryam [1]
Biller, Priscila [2]
Colijn, Caroline [2, 3]
Affiliations
[1] Simon Fraser Univ, Dept Comp Sci, Burnaby, BC V5A 1S6, Canada
[2] Simon Fraser Univ, Dept Math, Burnaby, BC V5A 1S6, Canada
[3] Imperial Coll London, Dept Math, London SW7 2BU, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
Keywords
influenza; phylogenetics; machine learning; prediction; tree shape statistics; transmission; dynamics; fitness; HIV
DOI
10.1098/rspb.2020.0319
Chinese Library Classification number
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
Seasonal influenza viruses are constantly changing and produce a different set of circulating strains each season. Small genetic changes can accumulate over time and result in antigenically different viruses; this may prevent the body's immune system from recognizing those viruses. Due to rapid mutations, in particular, in the haemagglutinin (HA) gene, seasonal influenza vaccines must be updated frequently. This requires choosing strains to include in the updates to maximize the vaccines' benefits, according to estimates of which strains will be circulating in upcoming seasons. This is a challenging prediction task. In this paper, we use longitudinally sampled phylogenetic trees based on HA sequences from human influenza viruses, together with counts of epitope site polymorphisms in HA, to predict which influenza virus strains are likely to be successful. We extract small groups of taxa (subtrees) and use a suite of features of these subtrees as key inputs to the machine learning tools. Using a range of training and testing strategies, including training on H3N2 and testing on H1N1, we find that successful prediction of future expansion of small subtrees is possible from these data, with accuracies of 0.71-0.85 and a classifier 'area under the curve' 0.75-0.9.
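The abstract describes the pipeline only in outline: extract small subtrees, summarise each with shape features, and score a classifier by accuracy and area under the curve. This record does not give the paper's feature set or models, so the following is only a minimal sketch of the general idea. The nested-tuple tree encoding, the three features, and the names `tip_depths`, `shape_features`, and `auc` are all assumptions made for illustration, not the authors' implementation.

```python
# Hypothetical illustration only: a rooted subtree is a nested tuple,
# and each tip (sampled virus) is a string label.

def tip_depths(tree, depth=0):
    """Return the depth (edges from the root) of every tip below `tree`."""
    if isinstance(tree, str):
        return [depth]
    depths = []
    for child in tree:
        depths.extend(tip_depths(child, depth + 1))
    return depths

def shape_features(tree):
    """A few simple tree shape statistics (assumed, not the paper's list)."""
    depths = tip_depths(tree)
    return {
        "n_tips": len(depths),
        "sackin": sum(depths),      # Sackin index: sum of tip depths
        "max_depth": max(depths),
    }

def auc(labels, scores):
    """AUC as the chance a random positive outscores a random negative."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

balanced = (("a", "b"), ("c", "d"))   # symmetric four-tip subtree
ladder = ((("a", "b"), "c"), "d")     # maximally unbalanced four-tip subtree
```

In this sketch the ladder-shaped subtree has a larger Sackin index than the balanced one, which is the kind of shape signal a classifier could take as input alongside other features such as the epitope site counts mentioned in the abstract.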
Pages: 10