VIDHOP, viral host prediction with deep learning

Cited by: 27
Authors
Mock, Florian [1]
Viehweger, Adrian [1]
Barth, Emanuel [2]
Marz, Manja [1, 3, 4, 5]
Affiliations
[1] Friedrich Schiller Univ Jena, Fac Math & Comp Sci, RNA Bioinformat High Throughput Anal, D-07743 Jena, Germany
[2] Friedrich Schiller Univ Jena, Bioinformat Core Facil Jena, D-07743 Jena, Germany
[3] Leibniz Inst Age Res, RNA Bioinformat High Throughput Anal, Fritz Lipmann Inst FLI, D-07743 Jena, Germany
[4] German Ctr Integrat Biodivers Res iDiv, RNA Bioinformat High Throughput Anal, D-04103 Halle, Germany
[5] European Virus Bioinformat Ctr EVBC, RNA Bioinformat High Throughput Anal, D-07743 Jena, Germany
Keywords
ALIGNMENT-FREE; VIRUS; ADAPTATION
DOI
10.1093/bioinformatics/btaa705
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Zoonosis, the natural transmission of infections from animals to humans, is a far-reaching global problem. The recent outbreaks of Zika virus, Ebola virus and coronavirus are examples of viral zoonoses, which are occurring more frequently due to globalization. In the event of a virus outbreak, it is helpful to know which host organism was the original carrier of the virus in order to prevent further spread of the infection. Recent approaches aim to predict a viral host based on the viral genome, often in combination with the potential host genome and arbitrarily selected features. These methods are limited either in the number of different hosts they can predict or in the accuracy of the prediction.

Results: Here, we present a fast and accurate deep learning approach for viral host prediction that is based on the viral genome sequence only. We tested our deep neural network (DNN) on three different virus species (influenza A virus, rabies lyssavirus and rotavirus A). For each virus species, we achieved an AUC between 0.93 and 0.98, allowing highly accurate predictions while using only fractions (100-400 bp) of the viral genome sequences. We show that deep neural networks are suitable for predicting the host of a virus, even with a limited number of sequences and highly unbalanced available data. The trained DNNs are the core of our virus-host prediction tool, Virus Deep learning HOst Prediction (VIDHOP). VIDHOP also allows the user to train and use models for other viruses.
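The abstract states that predictions are made from short fractions (100-400 bp) of the viral genome. The sketch below illustrates one common way such fragments could be prepared as numeric input for a neural network: cutting a sequence into fixed-length pieces and one-hot encoding each base. This is an assumption for illustration only; the abstract does not describe VIDHOP's actual preprocessing, and the function names `fragment` and `one_hot` are hypothetical.

```python
from typing import List

# Assumption: only the four canonical bases get their own channel.
ALPHABET = "ACGT"


def fragment(sequence: str, length: int) -> List[str]:
    """Cut a sequence into consecutive, non-overlapping fragments of `length` bases.

    A trailing remainder shorter than `length` is dropped.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    return [
        sequence[i:i + length]
        for i in range(0, len(sequence) - length + 1, length)
    ]


def one_hot(fragment_seq: str) -> List[List[int]]:
    """Encode each base as a 4-element indicator vector.

    Bases outside ALPHABET (for example the ambiguity code N) become all zeros.
    """
    return [
        [1 if base == letter else 0 for letter in ALPHABET]
        for base in fragment_seq.upper()
    ]


# Toy sequence, not real viral data; a realistic fragment length would be 100-400.
genome = "ATGCGTNACGTA"
fragments = fragment(genome, 4)
encoded = [one_hot(f) for f in fragments]
```

Each entry of `encoded` is a list of per-base indicator vectors with one row per base, a shape that a sequence model could consume after conversion to an array.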
Pages: 318-325
Page count: 8