Predicting hosts and cross-species transmission of Streptococcus agalactiae by interpretable machine learning

Cited by: 2
Authors
Ren, Yunxiao [1]
Li, Carmen [2]
Sapugahawatte, Dulmini Nanayakkara [2]
Zhu, Chendi [2]
Spaenig, Sebastian [1]
Jamrozy, Dorota [3]
Rothen, Julian [4, 5]
Daubenberger, Claudia A. [4, 5]
Bentley, Stephen D. [3]
Ip, Margaret [2]
Heider, Dominik [1, 6, 7]
Affiliations
[1] Philipps Univ Marburg, Fac Math & Comp Sci, Dept Data Sci Biomed, Marburg, Germany
[2] Chinese Univ Hong Kong, Fac Med, Dept Microbiol, Hong Kong, Peoples R China
[3] Wellcome Sanger Inst, Parasites & Microbes Programme, Wellcome Genome Campus, Cambridge, England
[4] Swiss Trop & Publ Hlth Inst Swiss TPH Basel, Dept Med Parasitol & Infect Biol, CH-4002 Basel, Switzerland
[5] Univ Basel, CH-4002 Basel, Switzerland
[6] Univ Dusseldorf, Inst Comp Sci, D-40211 Dusseldorf, Germany
[7] Heinrich Heine Univ Dusseldorf, Ctr Digital Hlth, Moorenstr 5, D-40225 Dusseldorf, Germany
Keywords
Hosts prediction; Host adaptations; Cross-species transmission; Interpretable machine learning; GROUP-B STREPTOCOCCUS; SEQUENCE TYPE 283; FISH
DOI
10.1016/j.compbiomed.2024.108185
Chinese Library Classification (CLC) number
Q [Biological sciences]
Discipline classification codes
07; 0710; 09
Abstract
Background: Streptococcus agalactiae, commonly known as Group B Streptococcus (GBS), exhibits a broad host range, manifesting as both a beneficial commensal and an opportunistic pathogen across various species. In humans, it poses significant risks, causing neonatal sepsis and meningitis, along with severe infections in adults. Additionally, it impacts livestock by inducing mastitis in bovines and contributing to epidemic mortality in fish populations. Despite its wide host spectrum, the mechanisms enabling GBS to adapt to specific hosts remain inadequately elucidated. Therefore, the development of a rapid and accurate method that differentiates GBS strains associated with particular animal hosts based on genome-wide information holds immense potential. Such a tool would not only bolster identification and containment efforts during GBS outbreaks but also deepen our comprehension of the bacterium's host adaptations spanning humans, livestock, and other natural animal reservoirs.
Methods and results: Here, we developed three machine learning models, random forest (RF), logistic regression (LR), and support vector machine (SVM), based on genome-wide mutation data. These models enabled precise prediction of the host origin of GBS, accurately distinguishing between human, bovine, fish, and pig hosts. Moreover, we conducted an interpretable machine learning analysis using SHapley Additive exPlanations (SHAP) and variant annotation to uncover the most influential genomic features and associated genes for each host. Additionally, by meticulously examining misclassified samples, we gained valuable insights into the dynamics of host transmission and the potential for zoonotic infections.
Conclusions: Our study underscores the effectiveness of random forest (RF) and logistic regression (LR) models based on mutation data for accurately predicting GBS host origins. Additionally, we identify the key features associated with each GBS host, thereby enhancing our understanding of the bacterium's host-specific adaptations.
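As a rough illustration of the attribution idea behind SHAP mentioned in the abstract, the sketch below computes exact Shapley values for a tiny, hand-written scoring function over three binary variant features. Everything in it is an assumption made for illustration: the function `toy_host_score`, the feature names `variant_A`, `variant_B`, `variant_C`, and the weights are hypothetical and are not the authors' models, data, or pipeline. The SHAP library itself approximates these values efficiently for real models; the brute-force enumeration here is only feasible for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values by enumerating every coalition of the other features.

    value_fn takes a frozenset of "present" features and returns a score.
    Cost grows as 2^n, so this is only a teaching sketch for small n.
    """
    n = len(features)
    result = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude f
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of f when added to coalition s
                total += weight * (value_fn(s | {f}) - value_fn(s))
        result[f] = total
    return result


def toy_host_score(present):
    """Hypothetical score for one host class (not from the paper).

    variant_A contributes on its own; variant_B and variant_C only
    contribute when both are present (an interaction effect).
    """
    score = 0.0
    if "variant_A" in present:
        score += 0.5
    if "variant_B" in present and "variant_C" in present:
        score += 0.25
    return score


features = ["variant_A", "variant_B", "variant_C"]
phi = shapley_values(toy_host_score, features)
for name in features:
    print(f"{name}: {phi[name]:.3f}")
```

In this toy setup the interaction credit is split evenly between `variant_B` and `variant_C`, and the attributions sum to the difference between the score with all variants present and the score with none, which is the additivity property that makes Shapley-based explanations convenient to read per feature.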
Pages: 12
Related papers
50 in total
  • [21] Predicting and interpreting digital platform survival: An interpretable machine learning approach
    Zhu, Xinyu
    Zhang, Qiang
    Ma, Baojun
    ELECTRONIC COMMERCE RESEARCH AND APPLICATIONS, 2024, 67
  • [22] Interpretable machine-learning models for predicting creep recovery of concrete
    Mei, Shengqi
    Liu, Xiaodong
    Wang, Xingju
    Li, Xufeng
    STRUCTURAL CONCRETE, 2024,
  • [24] Interpretable machine learning for predicting evaporation from Awash reservoirs, Ethiopia
    Eshetu, Kidist Demessie
    Alamirew, Tena
    Woldesenbet, Tekalegn Ayele
    EARTH SCIENCE INFORMATICS, 2023, 16 (04) : 3209 - 3226
  • [25] Exploring Potential Intermediates in the Cross-Species Transmission of Influenza A Virus to Humans
    Lee, Chung-Young
    VIRUSES-BASEL, 2024, 16 (07):
  • [26] Influenza Virus Genomic Mutations, Host Barrier and Cross-Species Transmission
    Xiong, Wenyan
    Zhang, Zongde
    CURRENT GENOMICS, 2024, : 161 - 174
  • [27] Cross-species transmission and animal infection model of hepatitis E virus
    Xu, Ling-Dong
    Zhang, Fei
    Xu, Pinglong
    Huang, Yao-Wei
    MICROBES AND INFECTION, 2025, 27 (01)
  • [28] Cross-species transmission of the newly identified coronavirus 2019-nCoV
    Ji, Wei
    Wang, Wei
    Zhao, Xiaofang
    Zai, Junjie
    Li, Xingguang
    JOURNAL OF MEDICAL VIROLOGY, 2020, 92 (04) : 433 - 440
  • [29] A panoramic view of the molecular epidemiology, evolution, and cross-species transmission of rosaviruses
    Zhang, Minyi
    Fan, Shunchang
    Liang, Minyi
    Wu, Ruojun
    Tian, Jingli
    Xian, Juxian
    Zhou, Xiaofeng
    Chen, Qing
    VETERINARY RESEARCH, 2024, 55 (01) : 145
  • [30] Porcine Deltacoronaviruses: Origin, Evolution, Cross-Species Transmission and Zoonotic Potential
    Kong, Fanzhi
    Wang, Qiuhong
    Kenney, Scott P.
    Jung, Kwonil
    Vlasova, Anastasia N.
    Saif, Linda J.
    PATHOGENS, 2022, 11 (01):