The METLIN small molecule dataset for machine learning-based retention time prediction

Authors
Xavier Domingo-Almenara
Carlos Guijas
Elizabeth Billings
J. Rafael Montenegro-Burke
Winnie Uritboonthai
Aries E. Aisporna
Emily Chen
H. Paul Benton
Gary Siuzdak
Affiliations
[1] The Scripps Research Institute, Scripps Center for Metabolomics
[2] The Scripps Research Institute, California Institute for Biomedical Research (Calibr)
[3] The Scripps Research Institute, Department of Integrative Structural and Computational Biology
[4] EURECAT – Technology Centre of Catalonia & Rovira i Virgili University joint unit, Centre for Omic Sciences
Abstract
Machine learning has been extensively applied in small molecule analysis to predict a wide range of molecular properties and processes including mass spectrometry fragmentation or chromatographic retention time. However, current approaches for retention time prediction lack sufficient accuracy due to limited available experimental data. Here we introduce the METLIN small molecule retention time (SMRT) dataset, an experimentally acquired reverse-phase chromatography retention time dataset covering up to 80,038 small molecules. To demonstrate the utility of this dataset, we deployed a deep learning model for retention time prediction applied to small molecule annotation. Results showed that in 70% of the cases, the correct molecular identity was ranked among the top 3 candidates based on their predicted retention time. We anticipate that this dataset will enable the community to apply machine learning or first principles strategies to generate better models for retention time prediction.
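
The top-3 ranking mentioned in the abstract can be illustrated with a minimal sketch. It assumes that candidates are ordered by the absolute difference between their predicted retention time and the observed one; the abstract does not specify the ranking criterion, so this is only one plausible reading. The function names, candidate identifiers, and retention time values are hypothetical and are not taken from the paper or the SMRT dataset.

```python
from typing import Dict, List


def rank_candidates(observed_rt: float, predicted_rt: Dict[str, float]) -> List[str]:
    """Order candidate identifiers by closeness of predicted RT to the observed RT.

    Assumption: ranking by absolute RT error; the paper may use another criterion.
    """
    return sorted(predicted_rt, key=lambda name: abs(predicted_rt[name] - observed_rt))


def in_top_k(true_id: str, ranked: List[str], k: int = 3) -> bool:
    """Return True if the correct identity is among the first k ranked candidates."""
    return true_id in ranked[:k]


# Hypothetical predicted retention times (seconds) for four candidate identities.
candidates = {"cand_A": 612.0, "cand_B": 655.0, "cand_C": 590.0, "cand_D": 720.0}
ranked = rank_candidates(observed_rt=600.0, predicted_rt=candidates)
hit = in_top_k("cand_B", ranked, k=3)
```

Averaging `in_top_k` over many annotated features would give a top-3 rate comparable in kind to the 70% figure reported above.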