Fast and Accurate Protein False Discovery Rates on Large-Scale Proteomics Data Sets with Percolator 3.0

Cited by: 259
Authors
The, Matthew [1]
MacCoss, Michael J. [2]
Noble, William S. [2, 3]
Kall, Lukas [1]
Affiliations
[1] KTH Royal Inst Technol, Sci Life Lab, Sch Biotechnol, Box 1031, S-17121 Solna, Sweden
[2] Univ Washington, Sch Med, Dept Genome Sci, Seattle, WA 98195 USA
[3] Univ Washington, Dept Comp Sci & Engn, Seattle, WA 98195 USA
Funding
National Institutes of Health (USA)
Keywords
Mass spectrometry - LC-MS/MS; Statistical analysis; Data processing and analysis; Protein inference; Large scale studies; TANDEM MASS-SPECTROMETRY; SHOTGUN PROTEOMICS; PEPTIDE IDENTIFICATION; SPECTRA; PROBABILITIES; DATABASES; INFERENCE; STRIKE;
DOI
10.1007/s13361-016-1460-7
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Percolator is a widely used software tool that increases yield in shotgun proteomics experiments and assigns reliable statistical confidence measures, such as q values and posterior error probabilities, to peptides and peptide-spectrum matches (PSMs) from such experiments. Percolator's processing speed has been sufficient for typical data sets consisting of hundreds of thousands of PSMs. With our new scalable approach, we can now also analyze millions of PSMs in a matter of minutes on a commodity computer. Furthermore, with growing awareness of the need for reliable statistics on the protein level, we compared several easy-to-understand protein inference methods and implemented the best-performing method in the Percolator package: grouping proteins by their corresponding sets of theoretical peptides and then considering only the best-scoring peptide for each protein. We used Percolator 3.0 to analyze the data from a recent study of the draft human proteome containing 25 million spectra (PMID: 24870542). The source code and Ubuntu, Windows, MacOS, and Fedora binary packages are available from http://percolator.ms/ under an Apache 2.0 license.
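The protein inference idea named in the abstract (group proteins by their theoretical peptide sets, then score each group by its best-scoring peptide) can be illustrated with a minimal Python sketch. This is not Percolator's code: the function names, the input dictionaries, and the `is_decoy` predicate are hypothetical, and the q-value step is a generic target-decoy estimate that may differ from what Percolator 3.0 actually implements.

```python
from collections import defaultdict


def group_proteins(protein_to_peptides):
    """Merge proteins whose theoretical peptide sets are identical."""
    by_peptides = defaultdict(list)
    for protein, peptides in protein_to_peptides.items():
        by_peptides[frozenset(peptides)].append(protein)
    # Key each group by the sorted tuple of its member protein names.
    return {tuple(sorted(names)): peps for peps, names in by_peptides.items()}


def score_groups(groups, peptide_scores):
    """Score each group by its single best-scoring identified peptide."""
    scored = []
    for names, peptides in groups.items():
        observed = [peptide_scores[p] for p in peptides if p in peptide_scores]
        if observed:  # groups without any identified peptide are skipped
            scored.append((names, max(observed)))
    # Best score first (assumes higher score = more confident).
    return sorted(scored, key=lambda item: item[1], reverse=True)


def q_values(scored, is_decoy):
    """Generic target-decoy q values for a best-first list of scored groups.

    Assumption: a group is a decoy when every member protein is a decoy.
    """
    decoy_flags = [all(is_decoy(n) for n in names) for names, _ in scored]
    targets = decoys = 0
    fdrs = []
    for flag in decoy_flags:
        if flag:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))
    # q value: lowest FDR at this score threshold or any more permissive one.
    running = float("inf")
    qs = []
    for fdr in reversed(fdrs):
        running = min(running, fdr)
        qs.append(running)
    qs.reverse()
    return [(names, q)
            for (names, _), q, flag in zip(scored, qs, decoy_flags)
            if not flag]


# Toy input: A and B share the same theoretical peptides and form one group.
proteins = {
    "A": {"p1", "p2"},
    "B": {"p1", "p2"},
    "C": {"p3"},
    "decoy_D": {"d1"},
}
scores = {"p1": 2.0, "p2": 0.5, "p3": 1.0, "d1": 1.5}

groups = group_proteins(proteins)
scored = score_groups(groups, scores)
results = q_values(scored, lambda name: name.startswith("decoy_"))
```

Using only the best-scoring peptide keeps the protein score simple to reason about, which matches the abstract's emphasis on easy-to-understand inference methods; the toy scores and the `decoy_` prefix are illustrative only.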
Pages: 1719-1727
Page count: 9
Related papers
31 in total
  • [1] How to talk about protein-level false discovery rates in shotgun proteomics
    The, Matthew
    Tasnim, Ayesha
    Kall, Lukas
    PROTEOMICS, 2016, 16 (18) : 2461 - 2469
  • [2] A Matter of Time: Faster Percolator Analysis via Efficient SVM Learning for Large-Scale Proteomics
    Halloran, John T.
    Rocke, David M.
    JOURNAL OF PROTEOME RESEARCH, 2018, 17 (05) : 1978 - 1982
  • [3] A Scalable Approach for Protein False Discovery Rate Estimation in Large Proteomic Data Sets
    Savitski, Mikhail M.
    Wilhelm, Mathias
    Hahne, Hannes
    Kuster, Bernhard
    Bantscheff, Marcus
    MOLECULAR & CELLULAR PROTEOMICS, 2015, 14 (09) : 2394 - 2404
  • [4] Application of de Novo Sequencing to Large-Scale Complex Proteomics Data Sets
    Devabhaktuni, Arun
    Elias, Joshua E.
    JOURNAL OF PROTEOME RESEARCH, 2016, 15 (03) : 732 - 742
  • [5] Secure discovery of genetic relatives across large-scale and distributed genomic data sets
    Hong, Matthew M.
    Froelicher, David
    Magner, Ricky
    Popic, Victoria
    Berger, Bonnie
    Cho, Hyunghoon
    GENOME RESEARCH, 2024, 34 (09) : 1312 - 1323
  • [6] KYSS: Mass spectrometry data quality assessment for protein analysis and large-scale proteomics
    Such-Sanmartin, Gerard
    Sidoli, Simone
    Ventura-Espejo, Estela
    Jensen, Ole N.
    BIOCHEMICAL AND BIOPHYSICAL RESEARCH COMMUNICATIONS, 2014, 445 (04) : 702 - 707
  • [7] Weighted False Discovery Rate Control in Large-Scale Multiple Testing
    Basu, Pallavi
    Cai, T. Tony
    Das, Kiranmoy
    Sun, Wenguang
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2018, 113 (523) : 1172 - 1183
  • [8] QuartetS: a fast and accurate algorithm for large-scale orthology detection
    Yu, Chenggang
    Zavaljevski, Nela
    Desai, Valmik
    Reifman, Jaques
    NUCLEIC ACIDS RESEARCH, 2011, 39 (13) : e88
  • [9] Improving large-scale proteomics by clustering of mass spectrometry data
    Beer, I
    Barnea, E
    Ziv, T
    Admon, A
    PROTEOMICS, 2004, 4 (04) : 950 - 960
  • [10] Statistical control of peptide and protein error rates in large-scale targeted data-independent acquisition analyses
    Rosenberger, George
    Bludau, Isabell
    Schmitt, Uwe
    Heusel, Moritz
    Hunter, Christie L.
    Liu, Yansheng
    MacCoss, Michael J.
    MacLean, Brendan X.
    Nesvizhskii, Alexey I.
    Pedrioli, Patrick G. A.
    Reiter, Lukas
    Rost, Hannes L.
    Tate, Stephen
    Ting, Ying S.
    Collins, Ben C.
    Aebersold, Ruedi
    NATURE METHODS, 2017, 14 (09) : 921 - +