A learned embedding for efficient joint analysis of millions of mass spectra

Cited by: 30
Authors
Bittremieux, Wout [1]
May, Damon H. [2]
Bilmes, Jeffrey [3, 4]
Noble, William Stafford [2, 4]
Affiliations
[1] Univ Calif San Diego, Skaggs Sch Pharm & Pharmaceut Sci, La Jolla, CA 92093 USA
[2] Univ Washington, Dept Genome Sci, Seattle, WA 98195 USA
[3] Univ Washington, Dept Elect & Comp Engn, Seattle, WA 98195 USA
[4] Univ Washington, Paul G Allen Sch Comp Sci & Engn, Seattle, WA 98195 USA
Funding
National Institutes of Health (USA);
Keywords
PROTEOMICS; IDENTIFICATION;
DOI
10.1038/s41592-022-01496-1
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Computational methods that aim to exploit publicly available mass spectrometry repositories rely primarily on unsupervised clustering of spectra. Here we trained a deep neural network in a supervised fashion on the basis of previous assignments of peptides to spectra. The network, called 'GLEAMS', learns to embed spectra in a low-dimensional space in which spectra generated by the same peptide are close to one another. We applied GLEAMS for large-scale spectrum clustering, detecting groups of unidentified, proximal spectra representing the same peptide. We used these clusters to explore the dark proteome of repeatedly observed yet consistently unidentified mass spectra. GLEAMS, a deep learning-based algorithm, embeds mass spectra such that spectra related to the same peptide are close to each other, enabling unknown spectra to be identified on a massive scale.
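The abstract says the network is trained so that spectra from the same peptide end up close together in the embedding space. One common way to express such an objective is a pairwise contrastive loss; the sketch below is illustrative only and assumes that form, since the abstract does not state which loss GLEAMS uses. The function name `contrastive_loss`, the `margin` parameter, and the toy embeddings are hypothetical.

```python
import numpy as np


def contrastive_loss(emb_a, emb_b, same_peptide, margin=1.0):
    """Mean pairwise contrastive loss over a batch of embedding pairs.

    Assumed form (not taken from the abstract): pairs labelled as the same
    peptide are penalised by their squared distance; pairs labelled as
    different peptides are penalised only if closer than `margin`.
    """
    emb_a = np.asarray(emb_a, dtype=float)
    emb_b = np.asarray(emb_b, dtype=float)
    same = np.asarray(same_peptide, dtype=float)

    # Euclidean distance between the two embeddings of each pair.
    dist = np.linalg.norm(emb_a - emb_b, axis=1)

    # Pull same-peptide pairs together.
    positive_term = same * dist ** 2
    # Push different-peptide pairs apart, up to the margin.
    negative_term = (1.0 - same) * np.maximum(margin - dist, 0.0) ** 2

    return float(np.mean(positive_term + negative_term))


# Toy example: two pairs of 2-D embeddings.
emb_a = [[0.0, 0.0], [0.0, 0.0]]
emb_b = [[0.0, 0.0], [3.0, 4.0]]

# Pair 1 identical and labelled "same", pair 2 far apart and labelled "different".
well_separated = contrastive_loss(emb_a, emb_b, same_peptide=[1, 0])
# Labels flipped: both pairs now violate the objective.
poorly_separated = contrastive_loss(emb_a, emb_b, same_peptide=[0, 1])
```

With labels that agree with the geometry, the loss is zero; flipping the labels makes it positive, which is the signal a training procedure would minimise.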
Pages: 675 / +
Number of pages: 18