DeepSAT: Learning Molecular Structures from Nuclear Magnetic Resonance Data

Cited by: 15
Authors
Kim, Hyun Woo [1, 2]
Zhang, Chen [1, 3]
Reher, Raphael [1, 4]
Wang, Mingxun [5, 6, 7]
Alexander, Kelsey L. [1, 8]
Nothias, Louis-Felix [9]
Han, Yoo Kyong [10]
Shin, Hyeji [10]
Lee, Ki Yong [1, 10]
Lee, Kyu Hyeong [2]
Kim, Myeong Ji [2]
Dorrestein, Pieter C. [5]
Gerwick, William H. [1, 5]
Cottrell, Garrison W. [3]
Affiliations
[1] Univ Calif San Diego, Scripps Inst Oceanog, Ctr Marine Biotechnol & Biomed, La Jolla, CA 92093 USA
[2] Dongguk Univ Seoul, Integrated Res Inst Drug Dev, Coll Pharm, Seoul, Gyeonggi Do, South Korea
[3] Univ Calif San Diego, Dept Comp Sci & Engn, San Diego, CA 92093 USA
[4] Univ Marburg, Inst Pharmaceut Biol & Biotechnol, Marburg, Germany
[5] Univ Calif San Diego, Skaggs Sch Pharm & Pharmaceut Sci, La Jolla, CA 92093 USA
[6] Ometa Labs LLC, San Diego, CA USA
[7] Univ Calif Riverside, Dept Comp Sci, Riverside, CA USA
[8] Univ Calif San Diego, Dept Chem & Biochem, La Jolla, CA USA
[9] Univ Cote Azur, Inst Chim Nice, UMR 7272, CNRS, F-06108 Nice, France
[10] Korea Univ, Coll Pharm, Sejong, South Korea
Funding
National Institutes of Health (USA); National Research Foundation, Singapore
Keywords
Convolutional neural network; Nuclear magnetic resonance; Structure prediction; Structure elucidation; Mass spectrometry; NMR database; Metabolomics; Discovery
DOI
10.1186/s13321-023-00738-4
Chinese Library Classification (CLC) number
O6 [Chemistry]
Discipline classification code
0703
Abstract
The identification of molecular structure is essential for understanding chemical diversity and for developing drug leads from small molecules. Nevertheless, the structure elucidation of small molecules by Nuclear Magnetic Resonance (NMR) experiments is often a long and non-trivial process that relies on years of training. To make this process more efficient, several spectral databases have been established for retrieving reference NMR spectra. However, the number of reference NMR spectra available is limited and has mostly facilitated annotation of commercially available derivatives. Here, we introduce DeepSAT, a neural network-based structure annotation and scaffold prediction system that directly extracts the chemical features associated with molecular structures from their NMR spectra. Using only the ¹H-¹³C HSQC spectrum, DeepSAT identifies related known compounds and thus efficiently assists in the identification of molecular structures. DeepSAT is expected to accelerate chemical and biomedical research by speeding up the identification of molecular structures.
Pages: 12