GT-NMR: a novel graph transformer-based approach for accurate prediction of NMR chemical shifts

Cited by: 1
Authors
Chen, Haochen [1]
Liang, Tao [1]
Tan, Kai [1]
Wu, Anan [1]
Lu, Xin [1]
Affiliations
[1] Xiamen Univ, Coll Chem & Chem Engn, Fujian Prov Key Lab Theoret & Computat Chem, Dept Chem, Xiamen 361005, Peoples R China
Source
JOURNAL OF CHEMINFORMATICS | 2024 / Vol. 16 / Issue 01
Keywords
NMR chemical shifts; Machine learning; Graph transformer; Transformer; Graph neural network; Molecular complexity
DOI
10.1186/s13321-024-00927-9
Chinese Library Classification number
O6 [Chemistry]
Discipline classification code
0703
Abstract
In this work, inspired by the graph transformer, we present an improved protocol, termed GT-NMR, which integrates a 2D molecular graph representation with the Transformer architecture for accurate yet efficient prediction of NMR chemical shifts. The effectiveness of GT-NMR was thoroughly examined on the standard nmrshiftdb2 dataset, on 37 natural products, and in the structural elucidation of 11 pairs of natural products. Systematic analysis affirms that GT-NMR outperforms traditional graph-based methods in all aspects, achieving state-of-the-art performance, with mean absolute errors of 0.158 and 1.189 ppm in predicting 1H and 13C NMR chemical shifts, respectively, on the standard nmrshiftdb2 dataset. Further scrutiny of its practical applications indicates that GT-NMR's efficacy is closely tied to molecular complexity, as quantified by the size-normalized spatial score (nSPS). For relatively simple molecules (nSPS ≤ 27.71), GT-NMR performs comparably to the best density functional, while its effectiveness diminishes significantly for complex molecules characterized by higher nSPS values (nSPS ≥ 38.42). This trend is consistent across other graph-based NMR chemical shift prediction methods as well. Therefore, when employing GT-NMR or other graph-based methods for the rapid and routine prediction of NMR chemical shifts, it is advisable to use nSPS to assess their suitability. The source code and trained model of GT-NMR are publicly available on GitHub.
Scientific contribution: GT-NMR, which combines the 2D molecular graph representation with the Transformer architecture, was implemented for the first time to predict atom-level NMR chemical shifts, achieving state-of-the-art performance. More importantly, the reliability of GT-NMR and other graph-based methods was assessed for the first time in terms of molecular complexity, as quantified by the size-normalized spatial score (nSPS). Systematic scrutiny demonstrated that GT-NMR offers a valuable way for routine application in structural screening and elucidation of relatively simple molecules.
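The abstract's recommendation to check nSPS before trusting a graph-based prediction can be illustrated with a minimal Python sketch. It is not taken from the GT-NMR code base: the function names and the regime labels ("simple", "intermediate", "complex") are hypothetical, and only the two thresholds (27.71 and 38.42) and the use of mean absolute error come from the abstract. The nSPS value itself is assumed to have been computed elsewhere, for example with a cheminformatics toolkit.

```python
from typing import Sequence

# Thresholds quoted in the abstract. The regime labels are illustrative
# assumptions, not the authors' terminology.
NSPS_SIMPLE_MAX = 27.71   # at or below: GT-NMR reported comparable to the best density functional
NSPS_COMPLEX_MIN = 38.42  # at or above: effectiveness reported to diminish significantly


def nsps_regime(nsps: float) -> str:
    """Classify a precomputed nSPS value against the abstract's thresholds."""
    if nsps <= NSPS_SIMPLE_MAX:
        return "simple"
    if nsps >= NSPS_COMPLEX_MIN:
        return "complex"
    # The abstract does not characterize the range between the two thresholds.
    return "intermediate"


def mean_absolute_error(predicted: Sequence[float], reference: Sequence[float]) -> float:
    """Mean absolute error (ppm) between predicted and reference chemical shifts."""
    if len(predicted) != len(reference) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


if __name__ == "__main__":
    # Made-up shift values, used only to show the call pattern.
    print(nsps_regime(20.0))
    print(mean_absolute_error([128.4, 77.1], [128.0, 77.5]))
```

In this sketch a molecule in the "complex" regime would be flagged for a more rigorous method rather than rejected outright; how to treat the intermediate range is left open because the abstract gives no guidance on it.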
Pages: 10