Protein Probability Model for High-Throughput Protein Identification by Mass Spectrometry-Based Proteomics

Cited by: 7
Authors
Prieto, Gorka [1]
Vazquez, Jesus [2]
Affiliations
[1] Univ Basque Country UPV EHU, Dept Commun Engn, Bilbao 48013, Spain
[2] Ctr Nacl Invest Cardiovasc Carlos III CNIC, Madrid 28049, Spain
Keywords
FDR; proteomics; protein identification; target-decoy approach; discovery rate estimation; peptide identification; shotgun proteomics; statistical model; search strategy; rates; confidence; drafts
DOI
10.1021/acs.jproteome.9b00819
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Shotgun proteomics is the method of choice for high-throughput protein identification; however, robust statistical methods are essential to automate this task while minimizing the number of false identifications. The standard method for estimating the false discovery rate (FDR) of individual identifications and keeping it below a threshold (typically 1%) is the target-decoy approach. Numerous studies have nevertheless shown that the FDR at the protein level may become much larger than the FDR at the peptide level, so an appropriate scoring model for identifying proteins from their peptides in high-throughput shotgun proteomics is needed. In this study, we present a novel protein-level scoring algorithm that uses the scores of the identified peptides and maintains all of the properties expected for a true protein probability. We also present a refinement of the picked method to calculate FDR at the protein level. These algorithms can be used together as a robust identification workflow suitable for large-scale proteomics, and we show that the identification performance of this workflow is superior to that of other widely used methods in several samples and using different search engines. Our protein probability model offers the scientific community an algorithm that is easy to integrate into protein identification workflows for the automated analysis of shotgun proteomics data.
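For readers unfamiliar with the target-decoy approach named in the abstract, the basic estimate can be sketched as follows. This is a minimal illustration of the generic idea only (FDR estimated as the number of decoy hits divided by the number of target hits above a score threshold); it is not the protein probability model or the refined picked method proposed in the paper. The function names and the `(score, is_decoy)` input layout are assumptions made for this example.

```python
from typing import List, Optional, Tuple

# Each identification is (score, is_decoy); higher scores are assumed better.
Identification = Tuple[float, bool]


def target_decoy_fdr(ids: List[Identification], threshold: float) -> float:
    """Estimate FDR at a score threshold as (#decoys / #targets) at or above it."""
    targets = sum(1 for score, is_decoy in ids if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in ids if score >= threshold and is_decoy)
    if targets == 0:
        # No target hits: the estimate is undefined; treat decoy-only sets as failing.
        return float("inf") if decoys > 0 else 0.0
    return decoys / targets


def lowest_threshold_for_fdr(
    ids: List[Identification], max_fdr: float = 0.01
) -> Optional[float]:
    """Return the lowest observed score whose estimated FDR is <= max_fdr, or None."""
    best = None
    # Scan observed scores from high to low; keep the lowest one that still passes.
    for threshold in sorted({score for score, _ in ids}, reverse=True):
        if target_decoy_fdr(ids, threshold) <= max_fdr:
            best = threshold
    return best


if __name__ == "__main__":
    example = [(10.0, False), (9.0, False), (8.0, True), (7.0, False)]
    print(lowest_threshold_for_fdr(example, max_fdr=0.01))
```

The same counting can be applied to peptide-level or protein-level scores; the abstract's point is that a threshold giving 1% FDR at the peptide level does not guarantee 1% at the protein level, which is why a dedicated protein-level score is proposed.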
Pages: 1285-1297
Number of pages: 13