Protein function annotation and virulence factor identification of Klebsiella pneumoniae genome by multiple machine learning models

Cited by: 1
Authors
Qian, Jinyang [1]
Jin, Pengfei [1]
Yang, Yueyue [1]
Ma, Nan [1]
Yang, Zhiyuan [1, 2]
Zhang, Xiaoli [1]
Affiliations
[1] Hangzhou Dianzi Univ, Sch Artificial Intelligence, Hangzhou, Zhejiang, Peoples R China
[2] Chinese Univ Hong Kong, Sch Biomed Sci, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Klebsiella pneumoniae; Machine learning; Virulence factor; Uncharacterized protein;
DOI
10.1016/j.micpath.2024.106727
Chinese Library Classification number
R392 [Medical Immunology]; Q939.91 [Immunology]
Discipline classification code
100102
Abstract
Klebsiella pneumoniae is a Gram-negative bacterium that can cause a range of infections in humans. In recent years, an increasing number of K. pneumoniae strains resistant to multiple antibiotics have emerged, posing a significant threat to public health. The protein functions of this bacterium are not well understood, so a systematic investigation of the K. pneumoniae proteome is urgently needed. In this study, the protein functions of this bacterium were re-annotated, and their functional groups were analyzed. Moreover, three machine learning models were built to identify novel virulence factors. Results showed that the functions of 16 uncharacterized proteins were annotated for the first time by sequence alignment. In addition, K. pneumoniae proteins share a high proportion of homology with Haemophilus influenzae and a low proportion with Chlamydia pneumoniae. By sequence analysis, 10 proteins were identified as potential drug targets for this bacterium. Our model achieved a high accuracy of 0.901 on the benchmark dataset. By applying our models to K. pneumoniae, we identified 39 virulence factors in this pathogen. Our findings could provide novel clues for the treatment of K. pneumoniae infection.
Pages: 9