Ad hoc learning of peptide fragmentation from mass spectra enables an interpretable detection of phosphorylated and cross-linked peptides

Cited by: 9
Authors
Altenburg, Tom [1, 2]
Giese, Sven [1]
Wang, Shengbo [1, 3]
Muth, Thilo [4]
Renard, Bernhard Y. [1]
Affiliations
[1] Univ Potsdam, Digital Engn Fac, Hasso Plattner Inst Digital Engn, Potsdam, Germany
[2] Free Univ Berlin, Dept Math & Comp Sci, Berlin, Germany
[3] European Bioinformat Inst EMBL EBI, European Mol Biol Lab, Wellcome Trust Genome Campus, Hinxton, England
[4] BAM Fed Inst Mat Res & Testing, Berlin, Germany
Keywords
SITE LOCALIZATION; MS/MS SPECTRA; IDENTIFICATION; KINASE; PHOSPHOPEPTIDES; PROTEOMICS; ALGORITHM; TARGET;
DOI
10.1038/s42256-022-00467-7
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Fragmentation of peptides leaves characteristic patterns in mass spectrometry data, which can be used to identify protein sequences, but this method is challenging for mutated or modified sequences for which limited information exists. Altenburg et al. use an ad hoc learning approach to learn relevant patterns directly from unannotated fragmentation spectra.

Mass spectrometry-based proteomics provides a holistic snapshot of the entire protein set of living cells on a molecular level. Currently, only a few deep learning approaches exist that involve peptide fragmentation spectra, which represent partial sequence information of proteins. Commonly, these approaches lack the ability to characterize less studied or even unknown patterns in spectra because of their use of explicit domain knowledge. Here, to elevate unrestricted learning from spectra, we introduce 'ad hoc learning of fragmentation' (AHLF), a deep learning model that is trained end-to-end on 19.2 million spectra from several phosphoproteomic datasets. AHLF is interpretable, and we show that peak-level feature importance values and pairwise interactions between peaks are in line with corresponding peptide fragments. We demonstrate our approach by detecting post-translational modifications, specifically protein phosphorylation, based on only the fragmentation spectrum without a database search. AHLF increases the area under the receiver operating characteristic curve (AUC) by an average of 9.4% on recent phosphoproteomic data compared with the current state of the art on this task. Furthermore, use of AHLF in rescoring search results increases the number of phosphopeptide identifications by a margin of up to 15.1% at a constant false discovery rate. To show the broad applicability of AHLF, we use transfer learning to also detect cross-linked peptides, as used in protein structure analysis, with an AUC of up to 94%.
Pages: 378 / +
Page count: 16
Related papers
63 in total
[1]  
Abadi M, 2016, ACM SIGPLAN NOTICES, V51, P1, DOI [10.1145/2951913.2976746, 10.1145/3022670.2976746]
[2]   Mass-spectrometric exploration of proteome structure and function [J].
Aebersold, Ruedi ;
Mann, Matthias .
NATURE, 2016, 537 (7620) :347-355
[3]  
Altenburg T., ZENODO
[4]   The Kipoi repository accelerates community exchange and reuse of predictive models for genomics [J].
Avsec, Ziga ;
Kreuzhuber, Roman ;
Israeli, Johnny ;
Xu, Nancy ;
Cheng, Jun ;
Shrikumar, Avanti ;
Banerjee, Abhimanyu ;
Kim, Daniel S. ;
Beier, Thorsten ;
Urban, Lara ;
Kundaje, Anshul ;
Stegle, Oliver ;
Gagneur, Julien .
NATURE BIOTECHNOLOGY, 2019, 37 (06) :592-600
[5]  
Bai S., 2018, CoRR abs/1803.01271
[6]   A probability-based approach for high-throughput protein phosphorylation analysis and site localization [J].
Beausoleil, Sean A. ;
Villen, Judit ;
Gerber, Scott A. ;
Rush, John ;
Gygi, Steven P. .
NATURE BIOTECHNOLOGY, 2006, 24 (10) :1285-1292
[7]  
Bittremieux W., LEARNED EMBEDDING EF, DOI 10.1101/483263 (2022)
[8]   spectrum_utils: A Python Package for Mass Spectrometry Data Processing and Visualization [J].
Bittremieux, Wout .
ANALYTICAL CHEMISTRY, 2020, 92 (01) :659-661
[9]   Fast Open Modification Spectral Library Searching through Approximate Nearest Neighbor Indexing [J].
Bittremieux, Wout ;
Meysman, Pieter ;
Noble, William Stafford ;
Laukens, Kris .
JOURNAL OF PROTEOME RESEARCH, 2018, 17 (10) :3463-3474
[10]   Andromeda: A Peptide Search Engine Integrated into the MaxQuant Environment [J].
Cox, Juergen ;
Neuhauser, Nadin ;
Michalski, Annette ;
Scheltema, Richard A. ;
Olsen, Jesper V. ;
Mann, Matthias .
JOURNAL OF PROTEOME RESEARCH, 2011, 10 (04) :1794-1805