Deep cross-modal learning between tandem mass spectrometry and molecular fingerprints for metabolite identification

Cited by: 0
Authors
Wang, Chaofu [1]
Xu, Ping [1]
Xue, Lingyun [1]
Liu, Yian [1]
Yan, Ming [1]
Chen, Anqi [2]
Hu, Shundi [2]
Wen, Luhong [2,3]
Affiliations
[1] Hangzhou Dianzi Univ, Coll Automat, Hangzhou 310028, Peoples R China
[2] Ningbo Univ, Res Inst Adv Technol, Ningbo 315211, Peoples R China
[3] China Innovat Instrument Co Ltd, Ningbo 315000, Peoples R China
Keywords
Metabolite annotation; Contrastive learning; Molecular fingerprints; Tandem mass spectra; Prediction
DOI
10.1016/j.ijms.2024.117388
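Chinese Library Classification
O64 [Physical chemistry (theoretical chemistry), chemical physics]; O56 [Molecular physics, atomic physics]
Discipline classification codes
070203; 070304; 081704; 1406
Abstract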
Metabolite annotation plays a key role in metabolomics. To enable structural annotation of unknown tandem mass spectra, predicting molecular fingerprints from MS/MS spectra is currently of great interest. However, current methods still struggle with the redundancy and high dimensionality of fingerprint features, which can reduce the accuracy and speed of annotation. We therefore propose a dual-tower model consisting of an MS/MS feature extractor and a fingerprint feature extractor, which directly computes the correlation between MS/MS spectra and molecular fingerprints without needing to predict the fingerprints. Moreover, the fingerprint feature extractor, consisting of two MLPs, effectively reduces fingerprint redundancy. Both feature extractors are optimized simultaneously by contrastive learning. We trained and tested our method on data downloaded from GNPS. The trained model was then used to search molecular structure databases such as PubChem. Experimental results show that our method outperforms MetFID, FingerScorer, MetFrag, DeepMass and CFM-ID in top-k evaluation.
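The dual-tower idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract only says "contrastive learning", so the symmetric InfoNCE-style loss, the temperature of 0.1, the cosine similarity, and all function names here are assumptions. The toy vectors stand in for the outputs of the MS/MS and fingerprint feature extractors.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def similarity_matrix(spec_emb, fp_emb):
    """Cosine similarity between every spectrum embedding and every fingerprint embedding."""
    s = [l2_normalize(v) for v in spec_emb]
    f = [l2_normalize(v) for v in fp_emb]
    return [[sum(a * b for a, b in zip(si, fj)) for fj in f] for si in s]

def info_nce(sim, temperature=0.1):
    """Symmetric InfoNCE-style loss (assumed); matched pairs lie on the diagonal."""
    n = len(sim)

    def cross_entropy_rows(m):
        total = 0.0
        for i, row in enumerate(m):
            logits = [x / temperature for x in row]
            mx = max(logits)  # subtract the max for numerical stability
            log_z = mx + math.log(sum(math.exp(l - mx) for l in logits))
            total += log_z - logits[i]
        return total / n

    transposed = [list(col) for col in zip(*sim)]
    return 0.5 * (cross_entropy_rows(sim) + cross_entropy_rows(transposed))

def top_k(sim_row, k):
    """Indices of the k candidate fingerprints most similar to one spectrum."""
    return sorted(range(len(sim_row)), key=lambda j: sim_row[j], reverse=True)[:k]

# Toy embeddings (hypothetical): row i of each list belongs to the same molecule.
spec_emb = [[1.0, 0.0], [0.0, 1.0]]
fp_emb = [[0.9, 0.1], [0.2, 0.8]]

aligned = similarity_matrix(spec_emb, fp_emb)
shuffled = similarity_matrix(spec_emb, fp_emb[::-1])
print("loss, matched pairs:   ", info_nce(aligned))
print("loss, mismatched pairs:", info_nce(shuffled))
print("best candidate for spectrum 0:", top_k(aligned[0], 1))
```

In this sketch, training would lower the loss by pulling matched spectrum and fingerprint embeddings together, and a database search reduces to ranking candidate fingerprints by similarity, which is what a top-k evaluation measures.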
Pages: 7