T3SEpp: an Integrated Prediction Pipeline for Bacterial Type III Secreted Effectors

Cited by: 24
Authors
Hui, Xinjie [1]
Chen, Zewei [1]
Lin, Mingxiong [2]
Zhang, Junya [1]
Hu, Yueming [1]
Zeng, Yingying [1]
Cheng, Xi [1]
Ou-Yang, Le [2]
Sun, Ming-an [3]
White, Aaron P. [4]
Wang, Yejun [1]
Affiliations
[1] Shenzhen Univ Hlth Sci, Sch Basic Med, Dept Cell Biol & Genet, Shenzhen, Peoples R China
[2] Shenzhen Univ, Coll Informat Engn, Guangdong Key Lab Intelligent Informat Proc, Shenzhen Key Lab Media Secur, Shenzhen, Peoples R China
[3] Yangzhou Univ, Coll Vet Med, Yangzhou, Jiangsu, Peoples R China
[4] Univ Saskatchewan, VIDO InterVac, Saskatoon, SK, Canada
Keywords
effector; machine learning; prediction; T3SEpp; T3SS; type III secretion system; HIDDEN MARKOV MODEL; VIRULENCE FACTORS; SYSTEM; IDENTIFICATION; PROTEINS; TRANSLOCATION; TYPHIMURIUM; BINDING; IV;
DOI
10.1128/mSystems.00288-20
Chinese Library Classification number
Q93 [Microbiology];
Subject classification code
071005; 100705;
Abstract
Many Gram-negative bacteria infect hosts and cause disease by translocating a variety of type III secreted effectors (T3SEs) into the host cell cytoplasm. However, despite a dramatic increase in the number of available whole-genome sequences, accurate prediction of T3SEs remains challenging. Traditional prediction models have focused on atypical sequence features buried in the N-terminal peptides of T3SEs, but these models have had high false-positive rates. In this research, we integrated promoter information with characteristic protein features of signal regions, chaperone-binding domains, and effector domains for T3SE prediction. Machine learning algorithms, including deep learning, were adopted to predict the atypical features mainly buried in the signal sequences of T3SEs, followed by development of a voting-based ensemble model integrating the individual prediction results. We assembled these components into a unified T3SE prediction pipeline, T3SEpp, which integrates the results of the individual modules, resulting in high accuracy (approximately 0.94) and a >1-fold reduction in the false-positive rate compared to that of state-of-the-art software tools. The T3SEpp pipeline and the sequence features observed here will facilitate the accurate identification of new T3SEs, with numerous benefits for future studies on host-pathogen interactions.
IMPORTANCE Type III secreted effector (T3SE) prediction remains a major computational challenge. In practical applications, current software tools often suffer from high false-positive rates. One causal factor could be the relatively narrow range of biological features used for the design and training of the models. In this research, we conducted a comprehensive survey of the sequence-based features of T3SEs, including signal sequences, chaperone-binding domains, effector domains, and transcription factor binding promoter sites, and assembled a unified prediction pipeline integrating multi-aspect biological features within homology-based and multiple machine learning models. To our knowledge, this research presents the most comprehensive biological sequence feature analysis for T3SEs compiled to date. The T3SEpp pipeline, which integrates this variety of features and assembles different models, showed high accuracy, which should facilitate more accurate identification of T3SEs in new and existing bacterial whole-genome sequences.
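The abstract mentions a voting-based ensemble that integrates the results of individual prediction modules. The following is a minimal Python sketch of one common way such an ensemble could combine per-module scores (weighted soft voting). It is not the T3SEpp implementation: the module names, weights, and 0.5 threshold are all hypothetical and chosen only for illustration.

```python
from typing import Dict

# Hypothetical module names and weights, for illustration only;
# none of these values come from the T3SEpp paper or its software.
MODULE_WEIGHTS: Dict[str, float] = {
    "signal": 0.4,
    "chaperone_binding": 0.2,
    "effector_domain": 0.2,
    "promoter": 0.2,
}


def ensemble_score(module_scores: Dict[str, float],
                   weights: Dict[str, float] = MODULE_WEIGHTS) -> float:
    """Return the weighted average of per-module scores (each in [0, 1]).

    Modules without a score for this protein are skipped, and the
    remaining weights are renormalized so the result stays in [0, 1].
    """
    used = {name: w for name, w in weights.items() if name in module_scores}
    if not used:
        raise ValueError("no recognized module scores supplied")
    total_weight = sum(used.values())
    weighted_sum = sum(module_scores[name] * w for name, w in used.items())
    return weighted_sum / total_weight


def is_predicted_effector(module_scores: Dict[str, float],
                          threshold: float = 0.5) -> bool:
    """Call a protein a candidate T3SE if the ensemble score meets the
    (assumed) decision threshold."""
    return ensemble_score(module_scores) >= threshold


if __name__ == "__main__":
    # Made-up scores for a single hypothetical protein.
    scores = {"signal": 0.9, "chaperone_binding": 0.3, "promoter": 0.7}
    print(round(ensemble_score(scores), 3), is_predicted_effector(scores))
```

Averaging scores rather than counting binary votes lets a strong signal from one module offset weak evidence from another; a majority vote over thresholded module calls would be an equally plausible reading of "voting-based".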
Pages: 19
Related papers
50 in total
  • [41] From prediction to function: Current practices and challenges towards the functional characterization of type III effectors
    De Ryck, Joren
    Van Damme, Petra
    Goormachtig, Sofie
    FRONTIERS IN MICROBIOLOGY, 2023, 14
  • [42] 2DCNNT3: Sequence Prediction based on 2DCNN for secret effector based bacterial Type III
    Sikander, Rahu
    Ghulam, Ali
    Swati, Zar Nawab Khan
    BIOSCIENCE RESEARCH, 2021, 18 (04): : 2574 - 2584
  • [43] An Experimental Pipeline for Initial Characterization of Bacterial Type III Secretion System Inhibitor Mode of Action Using Enteropathogenic Yersinia
    Morgan, Jessica M.
    Lam, Hanh N.
    Delgado, Jocelyn
    Luu, Justin
    Mohammadi, Sina
    Isberg, Ralph R.
    Wang, Helen
    Auerbuch, Victoria
    FRONTIERS IN CELLULAR AND INFECTION MICROBIOLOGY, 2018, 8
  • [44] Secreted in a Type III Secretion System-Dependent Manner, EsaH and EscE Are the Cochaperones of the T3SS Needle Protein EsaG of Edwardsiella piscicida
    Zeng, Zhi Xiong
    Liu, Lu Yi
    Xiao, Shui Bing
    Lu, Jin Fang
    Liu, Ying Li
    Li, Jing
    Zhou, Yuan Ze
    Liao, Li Jing
    Li, Duan You
    Zhou, Ying
    Nie, Pin
    Xie, Hai Xia
    MBIO, 2022, 13 (04):
  • [45] T3_MM: A Markov Model Effectively Classifies Bacterial Type III Secretion Signals
    Wang, Yejun
    Sun, Ming'an
    Bao, Hongxia
    White, Aaron P.
    PLOS ONE, 2013, 8 (03):
  • [46] DeepT3 2.0: improving type III secreted effector predictions by an integrative deep learning framework
    Jing, Runyu
    Wen, Tingke
    Liao, Chengxiang
    Xue, Li
    Liu, Fengjuan
    Yu, Lezheng
    Luo, Jiesi
    NAR GENOMICS AND BIOINFORMATICS, 2021, 3 (04)
  • [47] T346Hunter: A Novel Web-Based Tool for the Prediction of Type III, Type IV and Type VI Secretion Systems in Bacterial Genomes
    Manuel Martinez-Garcia, Pedro
    Ramos, Cayo
    Rodriguez-Palenzuela, Pablo
    PLOS ONE, 2015, 10 (04):
  • [48] An ensemble method for multi-type Gram-negative bacterial secreted protein prediction by integrating different PSSM-based features
    Kong, L.
    Zhang, L.
    SAR AND QSAR IN ENVIRONMENTAL RESEARCH, 2019, 30 (03) : 181 - 194
  • [49] Effectidor: an automated machine-learning-based web server for the prediction of type-III secretion system effectors
    Wagner, Naama
    Avram, Oren
    Gold-Binshtok, Dafna
    Zerah, Ben
    Teper, Doron
    Pupko, Tal
    BIOINFORMATICS, 2022, 38 (08) : 2341 - 2343
  • [50] Show me your secret(ed) weapons: a multifaceted approach reveals a wide arsenal of type III-secreted effectors in the cucurbit pathogenic bacterium Acidovorax citrulli and novel effectors in the Acidovorax genus
    Jimenez-Guerrero, Irene
    Perez-Montano, Francisco
    Da Silva, Gustavo Mateus
    Wagner, Naama
    Shkedy, Dafna
    Zhao, Mei
    Pizarro, Lorena
    Bar, Maya
    Walcott, Ron
    Sessa, Guido
    Pupko, Tal
    Burdman, Saul
    MOLECULAR PLANT PATHOLOGY, 2020, 21 (01) : 17 - 37