Efficiently predicting high resolution mass spectra with graph neural networks

Cited by: 0
Authors
Murphy, Michael [1, 2]
Jegelka, Stefanie [1]
Fraenkel, Ernest [2]
Kind, Tobias [3]
Healey, David [3]
Butler, Thomas [3]
Affiliations
[1] MIT, Comp Sci & Artificial Intelligence Lab, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[2] MIT, Dept Biol Engn, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[3] Enveda Biosci, Boulder, CO 80301 USA
Source
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 202 | 2023 / Vol. 202
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
SPECTROMETRY;
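DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract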
Identifying a small molecule from its mass spectrum is the primary open problem in computational metabolomics. This is typically cast as information retrieval: an unknown spectrum is matched against spectra predicted computationally from a large database of chemical structures. However, current approaches to spectrum prediction model the output space in ways that force a tradeoff between capturing high resolution mass information and tractable learning. We resolve this tradeoff by casting spectrum prediction as a mapping from an input molecular graph to a probability distribution over chemical formulas. We further discover that a large corpus of mass spectra can be closely approximated using a fixed vocabulary constituting only 2% of all observed formulas. This enables efficient spectrum prediction using GRAFF-MS, an architecture similar to graph classification, which achieves significantly lower prediction error and greater retrieval accuracy than previous approaches.
Pages: 14