Tandem Mass Spectrum Identification via Cascaded Search

Cited by: 51
Authors
Kertesz-Farkas, Attila [1]
Keich, Uri [2]
Noble, William Stafford [3]
Affiliations
[1] Univ Washington, Dept Genome Sci, Seattle, WA 98195 USA
[2] Univ Sydney, Sch Math & Stat, Camperdown, NSW 2006, Australia
[3] Univ Washington, Dept Genome Sci, Dept Comp Sci & Engn, Seattle, WA 98195 USA
Keywords
Peptide assignment; spectrum identification; FDR control; false discovery rate; MS-GF Plus; shotgun proteomics; protein identifications; peptide identification; posttranslational modifications; p-values; database; confidence; accurate
DOI
10.1021/pr501173s
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Accurate assignment of peptide sequences to observed fragmentation spectra is hindered by the large number of hypotheses that must be considered for each observed spectrum. A high score assigned to a particular peptide-spectrum match (PSM) may not end up being statistically significant after multiple testing correction. Researchers can mitigate this problem by controlling the hypothesis space in various ways: considering only peptides resulting from enzymatic cleavages, ignoring possible post-translational modifications or single nucleotide variants, etc. However, these strategies sacrifice identifications of spectra generated by rarer types of peptides. In this work, we introduce a statistical testing framework, cascaded search, that directly addresses this problem. The method requires that the user specify a priori a statistical confidence threshold as well as a series of peptide databases. For instance, such a cascade of databases could include fully tryptic, semitryptic, and nonenzymatic peptides or peptides with increasing numbers of modifications. Cascaded search then gradually expands the list of candidate peptides from more likely peptides toward rare peptides, sequestering at each stage any spectrum that is identified with the specified statistical confidence. We compare cascaded search to a standard procedure that lumps all of the peptides into a single database, as well as to a previously described group FDR procedure that computes the FDR separately within each database. We demonstrate, using simulated and real data, that cascaded search identifies more spectra at a fixed FDR threshold than either the ungrouped or the grouped approach. Cascaded search thus provides a general method for maximizing the number of identified spectra in a statistically rigorous fashion.
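The staged procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `accept_at_fdr` and `cascaded_search` are hypothetical, each stage is assumed to be a callable that returns one best-scoring PSM per spectrum against that stage's database, and the FDR at each stage is assumed to be estimated from decoy matches (decoys divided by targets above a score cutoff), which the abstract itself does not specify.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical PSM record: (spectrum_id, score, is_decoy); higher score is better.
PSM = Tuple[str, float, bool]
# Hypothetical stage: searches the given spectra against one database
# (targets plus decoys) and returns the best PSM for each spectrum.
Stage = Callable[[List[str]], List[PSM]]


def accept_at_fdr(psms: List[PSM], alpha: float) -> List[str]:
    """Return ids of target PSMs accepted at an estimated FDR <= alpha.

    Assumption: FDR at a score cutoff is estimated as decoys / targets
    among the PSMs scoring at or above that cutoff.
    """
    ranked = sorted(psms, key=lambda p: p[1], reverse=True)
    targets = decoys = 0
    best_cut = 0  # length of the longest ranked prefix meeting the threshold
    for i, (_, _, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets > 0 and decoys / targets <= alpha:
            best_cut = i
    return [sid for sid, _, is_decoy in ranked[:best_cut] if not is_decoy]


def cascaded_search(
    spectra: List[str], stages: List[Stage], alpha: float
) -> Dict[str, int]:
    """Map each identified spectrum id to the index of the stage that identified it."""
    remaining = list(spectra)
    identified: Dict[str, int] = {}
    for stage_index, search in enumerate(stages):
        if not remaining:
            break
        # Search only the spectra not yet identified by an earlier stage.
        accepted = set(accept_at_fdr(search(remaining), alpha))
        for sid in accepted:
            identified[sid] = stage_index
        # Sequester accepted spectra so later, rarer databases never see them.
        remaining = [sid for sid in remaining if sid not in accepted]
    return identified
```

Ordering the stages from most likely peptides (for example, fully tryptic) to rarest means a spectrum confidently explained by a common peptide is never re-tested against the larger, rarer databases, which is the property the abstract attributes to the method.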
Pages: 3027-3038
Page count: 12