ProLuCID: An improved SEQUEST-like algorithm with enhanced sensitivity and specificity

Cited by: 333
Authors
Xu, T. [1, 2]
Park, S. K. [1]
Venable, J. D. [1]
Wohlschlegel, J. A. [1]
Diedrich, J. K. [1]
Cociorva, D. [1]
Lu, B. [1]
Liao, L. [1]
Hewel, J. [1]
Han, X. [1]
Wong, C. C. L. [1]
Fonslow, B. [1]
Delahunty, C. [1]
Gao, Y. [1]
Shah, H. [1]
Yates, J. R., III [1]
Affiliations
[1] Scripps Res Inst, Dept Physiol Chem, La Jolla, CA 92037 USA
[2] Dow AgroSci LLC, Indianapolis, IN 46268 USA
Keywords
Proteomics; Identification; Mass spectrometry; Bioinformatics; Sequest; ProLuCID; TANDEM MASS-SPECTROMETRY; TRAP-ORBITRAP HYBRID; PROTEIN IDENTIFICATION; SPECTRAL DATA; ALIGNMENT ALGORITHM; SHOTGUN PROTEOMICS; DATABASE SEARCH; PEPTIDES; MS/MS; ACCURACY;
DOI
10.1016/j.jprot.2015.07.001
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
ProLuCID, a new algorithm for peptide identification using tandem mass spectrometry and protein sequence databases, has been developed. The algorithm uses a three-tier scoring scheme. First, a binomial probability serves as a preliminary score to select candidate peptides. The binomial probability scores generated by ProLuCID minimize molecular weight bias and are independent of database size. Second, a modified cross-correlation score (XCorr) is calculated for each candidate peptide selected by the binomial probability. This cross-correlation scoring function models the isotopic distributions of the fragment ions of candidate peptides, which ultimately results in higher sensitivity and specificity than that obtained with the SEQUEST XCorr. Finally, ProLuCID uses the distribution of XCorr values for all of the selected candidate peptides to compute a Z score for the peptide hit with the highest XCorr. The ProLuCID Z score combines the discriminative power of XCorr and DeltaCN, the standard parameters for assessing the quality of peptide identifications with SEQUEST, and displays a significant improvement in specificity over ProLuCID XCorr alone. ProLuCID can also take advantage of high-resolution MS/MS spectra, leading to further improvements in specificity compared with low-resolution tandem MS data. A comparison of filtered data searched with SEQUEST and ProLuCID at the same false discovery rate, as estimated by a target-decoy database strategy, shows that ProLuCID identified as many as 25% more proteins than SEQUEST. ProLuCID is implemented in Java and can be easily installed on a single computer or a computer cluster. This article is part of a Special Issue entitled: Computational Proteomics. (C) 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
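The abstract names the first and third scoring tiers (a binomial probability and a Z score over candidate XCorr values) without giving formulas. The Python sketch below illustrates the general idea only and is not ProLuCID's implementation. The function names, the per-peak match probability `p_match`, and the use of the population standard deviation over all candidates are assumptions made for illustration.

```python
import math
from statistics import mean, pstdev


def binomial_tail(n_trials, n_matched, p_match):
    """Probability of matching at least n_matched of n_trials fragment
    peaks by chance, for X ~ Binomial(n_trials, p_match).

    p_match is a hypothetical per-peak random-match probability; how
    ProLuCID derives it is not stated in the abstract.
    """
    return sum(
        math.comb(n_trials, k) * p_match**k * (1 - p_match) ** (n_trials - k)
        for k in range(n_matched, n_trials + 1)
    )


def top_hit_z_score(xcorrs):
    """Z score of the highest XCorr relative to the XCorr distribution
    of all selected candidates (assumption: population standard deviation).
    """
    if len(xcorrs) < 2:
        raise ValueError("need at least two candidate XCorr values")
    spread = pstdev(xcorrs)
    if spread == 0:
        raise ValueError("all candidate XCorr values are identical")
    return (max(xcorrs) - mean(xcorrs)) / spread


if __name__ == "__main__":
    # Illustrative inputs only; these are not data from the paper.
    print(binomial_tail(n_trials=20, n_matched=12, p_match=0.1))
    print(top_hit_z_score([4.1, 2.0, 1.8, 1.7, 1.5]))
```

A smaller tail probability means the observed number of matched peaks is less likely to arise by chance, and a larger Z score means the top hit stands further apart from the other candidates.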
Pages: 16-24
Number of pages: 9
Related papers
54 in total
  • [1] Protein identification by spectral networks analysis
    Bandeira, Nuno
    Tsur, Dekel
    Frank, Ari
    Pevzner, Pavel A.
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2007, 104 (15) : 6140 - 6145
  • [2] Search engine processor: Filtering and organizing peptide spectrum matches
    Carvalho, Paulo C.
    Fischer, Juliana S. G.
    Xu, Tao
    Cociorva, Daniel
    Balbuena, Tiago S.
    Valente, Richard H.
    Perales, Jonas
    Yates, John R., III
    Barbosa, Valmir C.
    [J]. PROTEOMICS, 2012, 12 (07) : 944 - 949
  • [3] YADA: a tool for taking the most out of high-resolution spectra
    Carvalho, Paulo C.
    Xu, Tao
    Han, Xuemei
    Cociorva, Daniel
    Barbosa, Valmir C.
    Yates, John R., III
    [J]. BIOINFORMATICS, 2009, 25 (20) : 2734 - 2736
  • [4] A dynamic programming approach to de novo peptide sequencing via tandem mass spectrometry
    Chen, T
    Kao, MY
    Tepel, M
    Rush, J
    Church, GM
    [J]. JOURNAL OF COMPUTATIONAL BIOLOGY, 2001, 8 (03) : 325 - 337
  • [5] pNovo+: De Novo Peptide Sequencing Using Complementary HCD and ETD Tandem Mass Spectra
    Chi, Hao
    Chen, Haifeng
    He, Kun
    Wu, Long
    Yang, Bing
    Sun, Rui-Xiang
    Liu, Jianyun
    Zeng, Wen-Feng
    Song, Chun-Qing
    He, Si-Min
    Dong, Meng-Qiu
    [J]. JOURNAL OF PROTEOME RESEARCH, 2013, 12 (02) : 615 - 625
  • [6] pNovo: De novo Peptide Sequencing and Identification Using HCD Spectra
    Chi, Hao
    Sun, Rui-Xiang
    Yang, Bing
    Song, Chun-Qing
    Wang, Le-Heng
    Liu, Chao
    Fu, Yan
    Yuan, Zuo-Fei
    Wang, Hai-Peng
    He, Si-Min
    Dong, Meng-Qiu
    [J]. JOURNAL OF PROTEOME RESEARCH, 2010, 9 (05) : 2713 - 2724
  • [7] Role of accurate mass measurement (±10 ppm) in protein identification strategies employing MS or MS/MS and database searching
    Clauser, KR
    Baker, P
    Burlingame, AL
    [J]. ANALYTICAL CHEMISTRY, 1999, 71 (14) : 2871 - 2882
  • [8] Cociorva D., 2007, CURRENT PROTOCOLS BI
  • [9] TANDEM: matching proteins with tandem mass spectra
    Craig, R
    Beavis, RC
    [J]. BIOINFORMATICS, 2004, 20 (09) : 1466 - 1467
  • [10] Faster SEQUEST Searching for Peptide Identification from Tandem Mass Spectra
    Diament, Benjamin J.
    Noble, William Stafford
    [J]. JOURNAL OF PROTEOME RESEARCH, 2011, 10 (09) : 3871 - 3879