On the role of gradients for machine learning of molecular energies and forces

Cited by: 95
Authors
Christensen, Anders S. [1]
von Lilienfeld, O. Anatole [1]
Affiliations
[1] Univ Basel, Dept Chem, Inst Phys Chem, Natl Ctr Computat Design & Discovery Novel Mat MA, Klingelbergst 80, CH-4056 Basel, Switzerland
Source
MACHINE LEARNING-SCIENCE AND TECHNOLOGY | 2020 / Vol. 1 / Issue 04
Funding
Swiss National Science Foundation; European Research Council;
Keywords
chemistry; machine learning; quantum mechanics; APPROXIMATION; POTENTIALS; SURFACES;
DOI
10.1088/2632-2153/abba6f
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The accuracy of any machine learning potential can only be as good as the data used in the fitting process. The most efficient model therefore selects the training data that will yield the highest accuracy compared to the cost of obtaining the training data. We investigate the convergence of prediction errors of quantum machine learning models for organic molecules trained on energy and force labels, two common data types in molecular simulations. When training models for the potential energy surface of a single molecule, we find that the inclusion of atomic forces in the training data increases the accuracy of the predicted energies and forces 7-fold, compared to models trained on energy only. Surprisingly, for models trained on sets of organic molecules of varying size and composition in non-equilibrium conformations, inclusion of forces in the training does not improve the predicted energies of unseen molecules in new conformations. Predicted forces, however, improve about 7-fold. For the systems studied, we find that force labels and energy labels contribute equally per label to the convergence of the prediction errors. The optimal choice of what type of training data to include depends on several factors: the computational cost of acquiring the force and energy labels for training, the application domain, the property of interest and the complexity of the machine learning model. Based on our observations we describe key considerations for the creation of new datasets for potential energy surfaces of molecules which maximize the efficiency of the resulting machine learning models.
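The abstract contrasts models trained on energy labels alone with models trained on energy and force labels. As a minimal sketch of what that means in practice, the toy example below fits a one-dimensional potential by least squares, first from energies only and then from energies plus forces, using the relation F = -dE/dx to turn each force label into an extra training row. Everything here is an assumption made for illustration: the quartic potential, the polynomial features, the training points, and the helper names (`features`, `feature_grads`, `fit`, `force_mae`) are hypothetical and are not the models or datasets used in the paper.

```python
import numpy as np

DEGREE = 4  # hypothetical model size: polynomial of degree 4 in one coordinate


def features(x):
    """Polynomial features phi_k(x) = x**k for k = 0..DEGREE."""
    return np.stack([x**k for k in range(DEGREE + 1)], axis=1)


def feature_grads(x):
    """Derivatives d(phi_k)/dx = k * x**(k-1); the k = 0 column is zero."""
    cols = [np.zeros_like(x)]
    cols += [k * x ** (k - 1) for k in range(1, DEGREE + 1)]
    return np.stack(cols, axis=1)


def true_energy(x):
    """Toy reference potential (an assumption, not from the paper)."""
    return 0.5 * x**2 + 0.1 * x**4


def true_force(x):
    """Force is the negative gradient of the energy."""
    return -(x + 0.4 * x**3)


def fit(x, use_forces):
    """Least-squares fit of weights w in E(x) = features(x) @ w.

    With use_forces=True, each training point also contributes one row
    for its force label, F(x) = -feature_grads(x) @ w.
    """
    A = features(x)
    b = true_energy(x)
    if use_forces:
        A = np.vstack([A, -feature_grads(x)])
        b = np.concatenate([b, true_force(x)])
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return w


def force_mae(w, x):
    """Mean absolute error of predicted forces on points x."""
    return np.mean(np.abs(-feature_grads(x) @ w - true_force(x)))


x_train = np.array([-1.0, 0.0, 1.5])   # three training geometries
x_test = np.linspace(-2.0, 2.0, 41)    # held-out geometries

w_energy_only = fit(x_train, use_forces=False)
w_energy_force = fit(x_train, use_forces=True)

print("force MAE, energy labels only :", force_mae(w_energy_only, x_test))
print("force MAE, energy + force     :", force_mae(w_energy_force, x_test))
```

In this toy setting three energy labels leave the five weights underdetermined, while adding the three force labels pins them down. That only illustrates how gradient labels add constraints per geometry; it says nothing about the magnitude of the improvements reported in the abstract.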
Pages: 14
Related papers
50 in total
  • [1] Multifidelity Machine Learning for Molecular Excitation Energies
    Vinod, Vivin
    Maity, Sayan
    Zaspel, Peter
    Kleinekathoefer, Ulrich
    JOURNAL OF CHEMICAL THEORY AND COMPUTATION, 2023, 19 (21) : 7658 - 7670
  • [2] Modeling of molecular atomization energies using machine learning
    Rupp, Matthias
    Tkatchenko, Alexandre
    Müller, Klaus-Robert
    von Lilienfeld, O. Anatole
    Journal of Cheminformatics, 4 (Suppl 1)
  • [3] Fast and accurate modeling of molecular energies with machine learning
    Rupp, Matthias
    Tkatchenko, Alexandre
    Mueller, Klaus-Robert
    von Lilienfeld, O. Anatole
    ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY, 2012, 243
  • [4] Analytical gradients for molecular-orbital-based machine learning
    Lee, Sebastian J. R.
    Husch, Tamara
    Ding, Feizhi
    Miller, Thomas F.
    JOURNAL OF CHEMICAL PHYSICS, 2021, 154 (12):
  • [5] Fast and Accurate Modeling of Molecular Atomization Energies with Machine Learning
    Rupp, Matthias
    Tkatchenko, Alexandre
    Mueller, Klaus-Robert
    von Lilienfeld, O. Anatole
    PHYSICAL REVIEW LETTERS, 2012, 108 (05)
  • [6] Solvation Free Energies from Machine Learning Molecular Dynamics
    Bonnet, Nicephore
    Marzari, Nicola
    JOURNAL OF CHEMICAL THEORY AND COMPUTATION, 2024, 20 (11) : 4820 - 4823
  • [7] Comment on "Fast and Accurate Modeling of Molecular Atomization Energies with Machine Learning"
    Moussa, Jonathan E.
    PHYSICAL REVIEW LETTERS, 2012, 109 (05)
  • [8] Assessment and Validation of Machine Learning Methods for Predicting Molecular Atomization Energies
    Hansen, Katja
    Montavon, Gregoire
    Biegler, Franziska
    Fazli, Siamac
    Rupp, Matthias
    Scheffler, Matthias
    von Lilienfeld, O. Anatole
    Tkatchenko, Alexandre
    Mueller, Klaus-Robert
    JOURNAL OF CHEMICAL THEORY AND COMPUTATION, 2013, 9 (08) : 3404 - 3419
  • [9] The Explanatory Role of Machine Learning in Molecular Biology
    Gross, Fridolin
    ERKENNTNIS, 2025, 90 (04) : 1583 - 1603
  • [10] Comment on "Fast and Accurate Modeling of Molecular Atomization Energies with Machine Learning" Reply
    Rupp, Matthias
    Tkatchenko, Alexandre
    Mueller, Klaus-Robert
    von Lilienfeld, O. Anatole
    PHYSICAL REVIEW LETTERS, 2012, 109 (05)