RNAVirHost: a machine learning-based method for predicting hosts of RNA viruses through viral genomes

Cited by: 0
Authors
Chen, Guowei [1]
Jiang, Jingzhe [2]
Sun, Yanni [1]
Affiliations
[1] City Univ Hong Kong, Dept Elect Engn, Kowloon, 83 Tat Chee Ave, Hong Kong, Peoples R China
[2] Chinese Acad Fishery Sci, South China Sea Fisheries Res Inst, Key Lab South China Sea Fishery Resources Exploita, Minist Agr & Rural Affairs, Guangzhou 510300, Peoples R China
Source
GIGASCIENCE | 2024 / Vol. 13
Keywords
RNA virus; host prediction; machine learning; metagenomics; MOLECULAR CHARACTERIZATION; VECTORS
DOI
10.1093/gigascience/giae059
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Discipline classification code
07; 0710; 09
Abstract
Background: High-throughput sequencing technologies have revolutionized the identification of novel RNA viruses. Given that viruses are infectious agents, identifying the hosts of these new viruses carries significant implications for public health and provides valuable insights into the dynamics of the microbiome. However, determining the hosts of newly discovered viruses is not always straightforward, especially for viruses detected in environmental samples. Even for host-associated samples, it is not always correct to assign the sample origin as the host of the identified viruses. Assigning hosts to RNA viruses remains challenging because of their high mutation rates and vast diversity.
Results: In this study, we introduce RNAVirHost, a machine learning-based tool that predicts the hosts of RNA viruses based solely on viral genomes. RNAVirHost is a hierarchical classification framework that predicts hosts at different taxonomic levels. We demonstrate the superior accuracy of RNAVirHost in predicting hosts of RNA viruses through comprehensive comparisons with various state-of-the-art techniques. When applied to viruses from novel genera, RNAVirHost achieved the highest accuracy of 84.3%, outperforming the alignment-based strategy by 12.1%.
Conclusions: The application of machine learning models has proven beneficial in predicting hosts of RNA viruses. By integrating genomic traits and sequence homologies, RNAVirHost provides a cost-effective and efficient strategy for host prediction. We believe that RNAVirHost can greatly assist in RNA virus analyses and contribute to pandemic surveillance.
Pages: 14