Machine learning meets mechanistic modelling for accurate prediction of experimental activation energies

Cited by: 132
Authors
Jorner, Kjell [1]
Brinck, Tore [2]
Norrby, Per-Ola [3]
Buttar, David [1]
Affiliations
[1] AstraZeneca, Early Chem Dev, Pharmaceut Sci, R&D, Macclesfield, Cheshire, England
[2] KTH Royal Inst Technol, Dept Chem, Appl Phys Chem, CBH, Stockholm, Sweden
[3] AstraZeneca, Data Sci & Modelling, Pharmaceut Sci, R&D, Gothenburg, Sweden
Funding
Swedish Research Council
Keywords
NUCLEOPHILIC-SUBSTITUTION; ELECTROSTATIC POTENTIALS; REACTIVITY; REGIOSELECTIVITY; CLASSIFICATION; EFFICIENT;
DOI
10.1039/d0sc04896h
Chinese Library Classification (CLC) number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Accurate prediction of chemical reactions in solution is challenging for current state-of-the-art approaches based on transition state modelling with density functional theory. Models based on machine learning have emerged as a promising alternative to address these problems, but these models currently lack the precision to give crucial information on the magnitude of barrier heights, influence of solvents and catalysts and extent of regio- and chemoselectivity. Here, we construct hybrid models which combine the traditional transition state modelling and machine learning to accurately predict reaction barriers. We train a Gaussian Process Regression model to reproduce high-quality experimental kinetic data for the nucleophilic aromatic substitution reaction and use it to predict barriers with a mean absolute error of 0.77 kcal mol⁻¹ for an external test set. The model was further validated on regio- and chemoselectivity prediction on patent reaction data and achieved a competitive top-1 accuracy of 86%, despite not being trained explicitly for this task. Importantly, the model gives error bars for its predictions that can be used for risk assessment by the end user. Hybrid models emerge as the preferred alternative for accurate reaction prediction in the very common low-data situation where only 100-150 rate constants are available for a reaction class. With recent advances in deep learning for quickly predicting barriers and transition state geometries from density functional theory, we envision that hybrid models will soon become a standard alternative to complement current machine learning approaches based on ground-state physical organic descriptors or structural information such as molecular graphs or fingerprints.
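To give a feel for what a barrier error of 0.77 kcal mol⁻¹ means in terms of rate constants, the sketch below applies the standard Eyring equation, k = (k_B T / h) exp(-ΔG‡ / RT). It is an illustration only, not the authors' workflow: the temperature of 298.15 K, a transmission coefficient of 1, and the function names `eyring_rate` and `rate_error_factor` are all assumptions made here, and the 20 kcal mol⁻¹ barrier is an arbitrary example value.

```python
import math

# Physical constants (CODATA values)
R_KCAL = 1.987204e-3    # gas constant, kcal mol^-1 K^-1
K_B = 1.380649e-23      # Boltzmann constant, J K^-1
H = 6.62607015e-34      # Planck constant, J s


def eyring_rate(barrier_kcal, temperature=298.15):
    """First-order rate constant (s^-1) from a free-energy barrier in kcal/mol.

    Assumes a transmission coefficient of 1 (an assumption of this sketch).
    """
    prefactor = K_B * temperature / H
    return prefactor * math.exp(-barrier_kcal / (R_KCAL * temperature))


def rate_error_factor(barrier_error_kcal, temperature=298.15):
    """Multiplicative error in the rate constant caused by a barrier error."""
    return math.exp(barrier_error_kcal / (R_KCAL * temperature))


if __name__ == "__main__":
    mae = 0.77  # mean absolute error quoted in the abstract, kcal/mol
    example_barrier = 20.0  # arbitrary illustrative barrier, kcal/mol
    print(f"k at {example_barrier} kcal/mol: {eyring_rate(example_barrier):.3e} s^-1")
    print(f"rate error factor for {mae} kcal/mol: {rate_error_factor(mae):.2f}")
```

Under these assumptions, an error of 0.77 kcal mol⁻¹ corresponds to a rate constant that is off by a factor of roughly 3 to 4 at room temperature. The factor depends only on the barrier error and the temperature, not on the barrier itself.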
Pages: 1163-1175
Page count: 13