Leveraging high-throughput molecular simulations and machine learning for the design of chemical mixtures

Cited by: 0
Authors
Chew, Alex K. [1]
Afzal, Mohammad Atif Faiz [2]
Kaplan, Zachary [1]
Collins, Eric M. [1]
Gattani, Suraj [1]
Misra, Mayank [1]
Chandrasekaran, Anand [1]
Leswing, Karl [1]
Halls, Mathew D. [3]
Affiliations
[1] Schrodinger Inc, New York, NY 10036 USA
[2] Schrodinger Inc, Portland, OR USA
[3] Schrodinger Inc, San Diego, CA USA
Keywords
NEURAL-NETWORK; PREDICTION
DOI
10.1038/s41524-025-01552-2
Chinese Library Classification
O64 [Physical chemistry (theoretical chemistry), chemical physics]
Discipline classification codes
070304; 081704
Abstract
Mixtures of chemical ingredients, such as formulations, are ubiquitous in materials science, but optimizing their properties remains challenging due to the vast design space. Computational approaches offer a promising solution to traverse this space while minimizing trial-and-error experimentation. Using high-throughput classical molecular dynamics simulations, we generated a comprehensive dataset of over 30,000 solvent mixtures to evaluate three machine learning approaches that connect molecular structure and composition to property: formulation descriptor aggregation (FDA), formulation graph (FG), and Set2Set-based method (FDS2S). Our results demonstrate that our new FDS2S approach outperforms other approaches in predicting simulation-derived properties. Formulation-property relationships can reveal important substructures and identify promising formulations at least two to three times faster than random guessing. The models show robust transferability to experimental datasets, accurately predicting properties across energy, pharmaceutical, and petroleum applications. Our research demonstrates the utility of high-throughput simulations and machine learning tools to design formulations with promising properties.
Pages: 16