Benchmarking mass spectrometry based proteomics algorithms using a simulated database

Cited by: 0
Authors
Muaaz Gul Awan
Abdullah Gul Awan
Fahad Saeed
Affiliations
[1] Lawrence Berkeley National Laboratory, Al
[2] University of Engineering & Technology (UET), Khwarizmi Institute of Computer Science (KICS)
[3] Florida International University, School of Computing and Information Sciences
Source
Network Modeling Analysis in Health Informatics and Bioinformatics | 2021 / Vol. 10
Keywords
Benchmarking; Peptide search algorithms; Proteomics; Mass-spectrometry
DOI
Not available
Abstract
Protein sequencing algorithms process data from a variety of instruments, generated under diverse experimental conditions. Currently there is no way to predict the accuracy of an algorithm for a given data set. Most published algorithms and their associated software have been evaluated on a limited number of experimental data sets. However, these performance evaluations do not cover the complete search space that the algorithm and the software might encounter in real-world use. To this end, we present a database of simulated spectra that can be used to benchmark any spectra-to-peptide search engine. We demonstrate the usability of this database by benchmarking two popular peptide sequencing engines. We show wide variation in the accuracy of peptide deductions, and that a complete quality profile of a given algorithm can be useful for practitioners and algorithm developers. All benchmarking data is available at https://users.cs.fiu.edu/~fsaeed/Benchmark.html