Evaluation of signal peptide prediction algorithms for identification of mycobacterial signal peptides using sequence data from proteomic methods

Cited by: 27
Authors
Leversen, Nils Anders [1]
de Souza, Gustavo A. [1]
Malen, Hiwa [1]
Prasad, Swati [2, 3]
Jonassen, Inge [2, 3]
Wiker, Harald G. [1, 4]
Affiliations
[1] Univ Bergen, Gade Inst, Microbiol & Immunol Sect, N-5021 Bergen, Norway
[2] Univ Bergen, BCCS, Dept Informat, N-5020 Bergen, Norway
[3] Univ Bergen, BCCS, Computat Biol Unit, N-5020 Bergen, Norway
[4] Haukeland Hosp, Dept Microbiol & Immunol, N-5021 Bergen, Norway
Source
MICROBIOLOGY-SGM | 2009 / Vol. 155
Keywords
COMPLETE GENOME SEQUENCE; GRAM-NEGATIVE BACTERIA; BOVIS BCG; CLEAVAGE SITES; GEL-ELECTROPHORESIS; TUBERCULOSIS H37RV; MASS-SPECTROMETRY; PROTEINS; SECRETION; ACID;
DOI
10.1099/mic.0.025270-0
Chinese Library Classification
Q93 [Microbiology];
Discipline classification code
071005 ; 100705 ;
Abstract
Secreted proteins play an important part in the pathogenicity of Mycobacterium tuberculosis, and are the primary source of vaccine and diagnostic candidates. A majority of these proteins are exported via the signal peptidase I-dependent pathway, and have a signal peptide that is cleaved off during the secretion process. Sequence similarities within signal peptides have spurred the development of several algorithms for predicting their presence as well as the respective cleavage sites. For proteins exported via this pathway, algorithms exist for eukaryotes, and for Gram-negative and Gram-positive bacteria. However, the unique structure of the mycobacterial membrane raises the question of whether the existing algorithms are suitable for predicting signal peptides within mycobacterial proteins. In this work, we have evaluated the performance of nine signal peptide prediction algorithms on a positive validation set, consisting of 57 proteins with a verified signal peptide and cleavage site, and a negative set, consisting of 61 proteins that have an N-terminal sequence that confirms the annotated translational start site. We found the hidden Markov model of SignalP v3.0 to be the best-performing algorithm for predicting the presence of a signal peptide in mycobacterial proteins. It predicted no false positives or false negatives, and predicted a correct cleavage site for 45 of the 57 proteins in the positive set. Based on these results, we used the hidden Markov model of SignalP v3.0 to analyse the 10 available annotated proteomes of mycobacterial species, including annotations of M. tuberculosis H37Rv from the Wellcome Trust Sanger Institute and the J. Craig Venter Institute (JCVI). When excluding proteins with transmembrane regions among the proteins predicted to harbour a signal peptide, we found between 7.8 and 10.5% of the proteins in the proteomes to be putative secreted proteins.
Interestingly, we observed a consistent difference in the percentage of predicted proteins between the Sanger Institute and JCVI. We have determined the most valuable algorithm for predicting signal peptidase I-processed proteins of M. tuberculosis, and used this algorithm to estimate the number of mycobacterial proteins with the potential to be exported via this pathway.
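The counts quoted in the abstract can be turned into standard performance figures. The following is a minimal sketch, not the authors' own analysis: the function name `presence_metrics` is hypothetical, and sensitivity and specificity are used here as conventional summary measures that the abstract does not itself name. Only the counts (57 positives, 61 negatives, no false predictions, 45 correct cleavage sites) come from the text.

```python
def presence_metrics(tp, fn, tn, fp):
    """Sensitivity and specificity for signal peptide presence prediction.

    Hypothetical helper using textbook definitions; not taken from the paper.
    """
    sensitivity = tp / (tp + fn)  # fraction of true signal peptides detected
    specificity = tn / (tn + fp)  # fraction of non-secreted proteins rejected
    return sensitivity, specificity


# Counts from the abstract: 57 verified positives, 61 verified negatives,
# with no false positives and no false negatives for the best-performing model.
sensitivity, specificity = presence_metrics(tp=57, fn=0, tn=61, fp=0)

# Cleavage site accuracy: 45 of the 57 positives had the correct site predicted.
cleavage_site_accuracy = 45 / 57

print(f"sensitivity: {sensitivity:.3f}")
print(f"specificity: {specificity:.3f}")
print(f"cleavage site accuracy: {cleavage_site_accuracy:.3f}")
```

Under these counts, presence prediction is perfect on both validation sets, while the cleavage site is correct for roughly 79% of the positive set, so locating the exact cleavage position remains the harder part of the task.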
Pages: 2375-2383
Page count: 9