Evaluating de novo sequencing in proteomics: already an accurate alternative to database-driven peptide identification?

Cited by: 75
Authors
Muth, Thilo [1]
Renard, Bernhard Y. [1]
Affiliations
[1] Robert Koch Institute, Bioinformatics, Berlin, Germany
Keywords
de novo peptide sequencing; benchmarking study; bioinformatics; tandem mass spectrometry; HCD; CID; peptide identification; sequence tags; TANDEM MASS-SPECTROMETRY; PROTEIN IDENTIFICATION; SHOTGUN PROTEOMICS; COMPUTER-PROGRAM; TOP-DOWN; TOOL; MS/MS; PERFORMANCE; ALGORITHMS; SOFTWARE;
DOI
10.1093/bib/bbx033
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification codes
071010; 081704;
Abstract
While peptide identifications in mass spectrometry (MS)-based shotgun proteomics are mostly obtained using database search methods, high-resolution spectrum data from modern MS instruments now offer the prospect of improving the performance of computational de novo peptide sequencing. The major benefit of de novo sequencing is that it does not require a reference database: full-length or partial tag-based peptide sequences are deduced directly from experimental tandem mass spectrometry (MS/MS) spectra. Although various algorithms have been developed for automated de novo sequencing, the prediction accuracy of the proposed solutions has rarely been evaluated in independent benchmarking studies. The main objective of this work is to provide a detailed evaluation of the performance of de novo sequencing algorithms on high-resolution data. For this purpose, we processed four experimental data sets acquired on different instrument types in collision-induced dissociation (CID) and higher-energy collisional dissociation (HCD) fragmentation modes using the software packages Novor, PEAKS and PepNovo. Moreover, the accuracy of these algorithms is also tested on ground truth data based on simulated spectra generated by peak intensity prediction software. We found that Novor shows the best overall performance compared with PEAKS and PepNovo with respect to the accuracy of full peptide, tag-based and single-residue predictions. In addition, Novor ran around 12 to 17 times faster than its commercial competitor PEAKS. Despite a prediction accuracy of around 35% for complete peptide sequences on HCD data sets, the evaluated algorithms, taken as a whole, perform moderately on experimental data but show a significantly better performance on simulated data (up to 84% accuracy). Further, we describe the most frequently occurring de novo sequencing errors and evaluate the influence of missing fragment ion peaks and spectral noise on accuracy. Finally, we discuss the potential of de novo sequencing to become more widely used in the field.
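The abstract distinguishes three levels of correctness: full peptide, tag-based and single-residue predictions. As a rough illustration only, and not the scoring procedure used in the paper, the Python sketch below computes one possible version of each metric. The assumptions are: leucine and isoleucine are treated as identical because they have the same mass; residues are compared position by position rather than by fragment mass; a tag counts as correct when the prediction shares a contiguous run of at least `min_tag` residues with the reference; and all function names and example peptides are hypothetical.

```python
from typing import Dict, List, Tuple


def normalize(seq: str) -> str:
    """Upper-case the sequence and map I to L (assumed indistinguishable by mass)."""
    return seq.upper().replace("I", "L")


def peptide_correct(pred: str, truth: str) -> bool:
    """True if the whole predicted sequence equals the reference (I/L ignored)."""
    return normalize(pred) == normalize(truth)


def residue_matches(pred: str, truth: str) -> int:
    """Count positions where predicted and reference residues agree.

    Simplification: plain position-wise comparison; zip() stops at the
    shorter sequence, so surplus residues never count as matches.
    """
    return sum(p == t for p, t in zip(normalize(pred), normalize(truth)))


def longest_shared_tag(pred: str, truth: str) -> int:
    """Length of the longest contiguous run of residues shared by both sequences."""
    a, b = normalize(pred), normalize(truth)
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def summarize(pairs: List[Tuple[str, str]], min_tag: int = 3) -> Dict[str, float]:
    """Aggregate peptide-, tag- and residue-level accuracy over (pred, truth) pairs."""
    if not pairs:
        raise ValueError("at least one (prediction, reference) pair is required")
    total_residues = sum(len(truth) for _, truth in pairs)
    if total_residues == 0:
        raise ValueError("reference sequences must not all be empty")
    n = len(pairs)
    return {
        "peptide": sum(peptide_correct(p, t) for p, t in pairs) / n,
        "tag": sum(longest_shared_tag(p, t) >= min_tag for p, t in pairs) / n,
        "residue": sum(residue_matches(p, t) for p, t in pairs) / total_residues,
    }


if __name__ == "__main__":
    # Hypothetical (prediction, reference) pairs, not data from the study.
    example = [("PEPTIDE", "PEPTLDE"), ("PEPTIDEK", "PEPTDIEK"), ("AAAK", "GGGR")]
    print(summarize(example))
```

Published benchmarks typically align residues by cumulative fragment mass instead of by position, so this sketch would undercount correct residues whenever a prediction differs from the reference in length.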
Pages: 954-970
Page count: 17