MS-Rescue: A Computational Pipeline to Increase the Quality and Yield of Immunopeptidomics Experiments

Cited by: 34
Authors
Andreatta, Massimo [1]
Nicastri, Annalisa [2]
Peng, Xu [2]
Hancock, Gemma [2]
Dorrell, Lucy [2, 3]
Ternette, Nicola [4]
Nielsen, Morten [1, 5]
Affiliations
[1] Univ Nacl San Mart, Inst Invest Biotecnol, Av 25 Mayo & Francia CP 1650, San Martin, Argentina
[2] Univ Oxford, Nuffield Dept Med, Oxford OX3 7BN, England
[3] Oxford NIHR Biomed Res Ctr, Oxford OX4 2PG, England
[4] Univ Oxford, Jenner Inst, Oxford OX3 7DQ, England
[5] Tech Univ Denmark, Dept Bio & Hlth Informat, DK-2800 Lyngby, Denmark
Keywords
machine learning; mass spectrometry; MHC; peptidome; sequence motifs; PEPTIDE IDENTIFICATION; MASS; LIGAND; PREDICTION; PLATFORM;
DOI
10.1002/pmic.201800357
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
LC-MS/MS has become the standard platform for the characterization of immunopeptidomes, the collection of peptides naturally presented by major histocompatibility complex (MHC) molecules at the cell surface. The protocols and algorithms used for immunopeptidomics data analysis are based on tools developed for traditional bottom-up proteomics, which address the identification of peptides generated by tryptic digestion. Such algorithms are generally not tailored to the specific requirements of MHC ligand identification and, as a consequence, immunopeptidomics datasets suffer from the dismissal of informative spectral information and from high false discovery rates. Here, a new pipeline for the refinement of peptide-spectrum matches (PSMs) is proposed, based on the assumption that immunopeptidomes contain a limited number of recurring peptide motifs, corresponding to MHC specificities. Sequence motifs are learned directly from the individual peptidome by training a prediction model on high-confidence PSMs. The model is then applied to PSM candidates with lower confidence, and sequences that score significantly higher than random peptides are rescued as likely true ligands. The pipeline is applied to MHC class I immunopeptidomes from three different species, and it is shown that it can increase the number of identified ligands by up to 20-30%, while effectively removing false positives and products of co-precipitation. Spectral validation using synthetic peptides confirms the identity of a large proportion of rescued ligands in the experimental peptidome.
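The rescue idea described in the abstract (learn a motif from high-confidence PSMs, then keep lower-confidence candidates that score well above random peptides) can be illustrated with a minimal Python sketch. This is not the MS-Rescue implementation: it swaps in a single log-odds position-specific scoring matrix as the prediction model, and it assumes fixed-length peptides, a uniform amino acid background, and a single motif. The names `train_pssm`, `score`, `rescue`, `n_random`, and `alpha` are hypothetical and chosen only for illustration.

```python
import math
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def train_pssm(peptides, pseudocount=1.0):
    """Build a log-odds scoring matrix from equal-length peptides.

    Assumption: uniform background frequency (1/20) for every amino acid.
    """
    length = len(peptides[0])
    background = 1.0 / len(AMINO_ACIDS)
    total = len(peptides) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for pos in range(length):
        counts = Counter(p[pos] for p in peptides)
        pssm.append({
            aa: math.log(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def score(pssm, peptide):
    """Sum the per-position log-odds of a peptide."""
    return sum(column[aa] for column, aa in zip(pssm, peptide))


def rescue(high_conf, low_conf, n_random=2000, alpha=0.01, seed=0):
    """Return low-confidence peptides scoring above random peptides.

    The threshold is the empirical (1 - alpha) quantile of scores obtained
    for uniformly random peptides (an assumed null model).
    """
    rng = random.Random(seed)
    length = len(high_conf[0])
    pssm = train_pssm(high_conf)
    null_scores = sorted(
        score(pssm, "".join(rng.choice(AMINO_ACIDS) for _ in range(length)))
        for _ in range(n_random)
    )
    index = min(int((1 - alpha) * n_random), n_random - 1)
    threshold = null_scores[index]
    # Peptides of a different length are skipped in this simplified sketch.
    return [
        p for p in low_conf
        if len(p) == length and score(pssm, p) > threshold
    ]
```

A call such as `rescue(confident_9mers, candidate_9mers)` would return the candidates that resemble the learned motif. A real immunopeptidome usually mixes several MHC specificities and peptide lengths, so a single matrix like this one would need to be replaced by a model that handles multiple motifs.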
Pages: 7