A Decoy-Free Approach to the Identification of Peptides

Cited by: 29
Authors
Gonnelli, Giulia [1, 2]
Stock, Michiel [3]
Verwaeren, Jan [3]
Maddelein, Davy [1, 2]
De Baets, Bernard [3]
Martens, Lennart [1, 2]
Degroeve, Sven [1, 2]
Affiliations
[1] VIB, Dept Med Prot Res, B-9000 Ghent, Belgium
[2] Univ Ghent, Dept Biochem, B-9000 Ghent, Belgium
[3] Univ Ghent, Dept Math Modelling Stat & Bioinformat, B-9000 Ghent, Belgium
Keywords
peptide identification; decoy databases; machine learning; tandem mass-spectra; database search; statistical-model; spectrometry; proteomics; proteogenomics; MS/MS; metaproteomics; confidence; proteins
DOI
10.1021/pr501164r
CLC number
Q5 [Biochemistry]
Subject classification code
071010; 081704
Abstract
A growing number of proteogenomics and metaproteomics studies indicate potential limitations of the application of the decoy database paradigm used to separate correct peptide identifications from incorrect ones in traditional shotgun proteomics. We therefore propose a binary classifier called Nokoi that allows fast yet reliable decoy-free separation of correct from incorrect peptide-to-spectrum matches (PSMs). Nokoi was trained on a very large collection of heterogeneous data using ranks supplied by the Mascot search engine to label correct and incorrect PSMs. We show that Nokoi outperforms Mascot and achieves a performance very close to that of Percolator at substantially higher processing speeds.
Pages: 1792-1798
Number of pages: 7
Related papers
32 in total
  • [1] Mass spectrometry-based proteomics
    Aebersold, R
    Mann, M
    [J]. NATURE, 2003, 422 (6928) : 198 - 207
  • [2] Ansong Charles, 2008, Briefings in Functional Genomics & Proteomics, V7, P50, DOI 10.1093/bfgp/eln010
  • [3] An Integrated Mass-Spectrometry Pipeline Identifies Novel Protein Coding-Regions in the Human Genome
    Bitton, Danny A.
    Smith, Duncan L.
    Connolly, Yvonne
    Scutt, Paul J.
    Miller, Crispin J.
    [J]. PLOS ONE, 2010, 5 (01):
  • [4] Addressing Statistical Biases in Nucleotide-Derived Protein Databases for Proteogenomic Search Strategies
    Blakeley, Paul
    Overton, Ian M.
    Hubbard, Simon J.
    [J]. JOURNAL OF PROTEOME RESEARCH, 2012, 11 (11) : 5221 - 5234
  • [5] Accurate and Sensitive Peptide Identification with Mascot Percolator
    Brosch, Markus
    Yu, Lu
    Hubbard, Tim
    Choudhary, Jyoti
    [J]. JOURNAL OF PROTEOME RESEARCH, 2009, 8 (06) : 3176 - 3181
  • [6] Proteogenomics to discover the full coding content of genomes: A computational perspective
    Castellana, Natalie
    Bafna, Vineet
    [J]. JOURNAL OF PROTEOMICS, 2010, 73 (11) : 2124 - 2135
  • [7] Statistical validation of peptide identifications in large-scale proteomics using the target-decoy database search strategy and flexible mixture modeling
    Choi, Hyungwon
    Ghosh, Debashis
    Nesvizhskii, Alexey I.
    [J]. JOURNAL OF PROTEOME RESEARCH, 2008, 7 (01) : 286 - 292
  • [8] Analysis of the Resolution Limitations of Peptide Identification Algorithms
    Colaert, Niklaas
    Degroeve, Sven
    Helsens, Kenny
    Martens, Lennart
    [J]. JOURNAL OF PROTEOME RESEARCH, 2011, 10 (12) : 5555 - 5561
  • [9] TANDEM: matching proteins with tandem mass spectra
    Craig, R
    Beavis, RC
    [J]. BIOINFORMATICS, 2004, 20 (09) : 1466 - 1467
  • [10] Faster SEQUEST Searching for Peptide Identification from Tandem Mass Spectra
    Diament, Benjamin J.
    Noble, William Stafford
    [J]. JOURNAL OF PROTEOME RESEARCH, 2011, 10 (09) : 3871 - 3879