Rapid and accurate peptide identification from tandem mass spectra

Cited by: 142
Authors
Park, Christopher Y. [1]
Klammer, Aaron A. [1]
Kaell, Lukas [1]
MacCoss, Michael J. [1]
Noble, William S. [1,2]
Affiliations
[1] Univ Washington, Dept Genome Sci, Seattle, WA 98195 USA
[2] Univ Washington, Dept Comp Sci & Engn, Seattle, WA 98195 USA
Keywords
mass spectrometry; peptide identification; proteomics; bioinformatics
DOI
10.1021/pr800127y
Chinese Library Classification
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Mass spectrometry, the core technology in the field of proteomics, promises to enable scientists to identify and quantify the entire complement of proteins in a complex biological sample. Currently, the primary bottleneck in this type of experiment is computational. Existing algorithms for interpreting mass spectra are slow and fail to identify a large proportion of the given spectra. We describe a database search program called Crux that reimplements and extends the widely used database search program SEQUEST. For speed, Crux uses a peptide indexing scheme to rapidly retrieve candidate peptides for a given spectrum. For each peptide in the target database, Crux generates shuffled decoy peptides on the fly, providing a good null model and, hence, accurate false discovery rate estimates. Crux also implements two recently described postprocessing methods: a p value calculation based upon fitting a Weibull distribution to the observed scores, and a semisupervised method that learns to discriminate between target and decoy matches. Both methods significantly improve the overall rate of peptide identification. Crux is implemented in C and is distributed with source code freely to noncommercial users.
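The decoy-based false discovery rate estimate mentioned in the abstract can be illustrated with a minimal Python sketch. This is not Crux's code (Crux is written in C), and the details are assumptions made for illustration: the function names `shuffle_decoy` and `estimate_fdr` are hypothetical, the decoy keeps its first and last residues fixed while shuffling the interior, and the FDR at a score threshold is approximated as the number of decoy matches divided by the number of target matches at or above that threshold.

```python
import random


def shuffle_decoy(peptide, rng):
    """Return a shuffled decoy for a peptide sequence.

    Assumption for this sketch: the first and last residues stay in
    place and only the interior residues are permuted.
    """
    if len(peptide) <= 3:
        # Zero or one interior residue: nothing to permute.
        return peptide
    interior = list(peptide[1:-1])
    rng.shuffle(interior)
    return peptide[0] + "".join(interior) + peptide[-1]


def estimate_fdr(target_scores, decoy_scores, threshold):
    """Estimate the FDR among target matches scoring >= threshold.

    Assumption: one decoy per target, so the count of decoy matches
    above the threshold approximates the count of false target matches.
    """
    targets = sum(1 for s in target_scores if s >= threshold)
    decoys = sum(1 for s in decoy_scores if s >= threshold)
    if targets == 0:
        return 0.0
    return min(1.0, decoys / targets)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the decoy is reproducible
    print(shuffle_decoy("ACDEFGHIK", rng))
    print(estimate_fdr([5.0, 4.0, 3.0, 2.0], [3.0, 1.0], 2.5))
```

The shuffled decoy has the same amino acid composition, and therefore the same mass, as its target, which is why it can serve as a null model for the same candidate mass window.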
Pages: 3022-3027
Page count: 6