Group Contribution and Machine Learning Approaches to Predict Abraham Solute Parameters, Solvation Free Energy, and Solvation Enthalpy

Cited by: 126
Authors
Chung, Yunsie [1]
Vermeire, Florence H. [1]
Wu, Haoyang [1]
Walker, Pierre J. [1, 2]
Abraham, Michael H. [3]
Green, William H. [1]
Affiliations
[1] MIT, Dept Chem Engn, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[2] Imperial Coll London, Dept Chem Engn, London SW7 2AZ, England
[3] UCL, Dept Chem, London WC1H 0AJ, England
Funding
U.S. National Science Foundation;
Keywords
PARTITION-COEFFICIENTS; GAS-PHASE; NEUTRAL MOLECULES; WATER; DESCRIPTORS; MODEL; DRY; SOLUBILITY; SOLVENTS; WET;
DOI
10.1021/acs.jcim.1c01103
Chinese Library Classification number
R914 [Medicinal Chemistry];
Discipline classification code
100701;
Abstract
We present a group contribution method (SoluteGC) and a machine learning model (SoluteML) to predict the Abraham solute parameters, as well as a machine learning model (DirectML) to predict solvation free energy and enthalpy at 298 K. The proposed group contribution method uses atom-centered functional groups with corrections for ring and polycyclic strain, while the machine learning models adopt a directed message passing neural network. The solute parameters predicted by SoluteGC and SoluteML are used to calculate solvation free energy and enthalpy via linear free energy relationships. Extensive data sets containing 8366 solute parameters, 20,253 solvation free energies, and 6322 solvation enthalpies are compiled in this work to train the models. The three models are each evaluated on the same test sets using both random and substructure-based solute splits for solvation free energy and enthalpy predictions. The results show that the DirectML model is superior to the SoluteML and SoluteGC models for both predictions and can provide accuracy comparable to that of advanced quantum chemistry methods. Although the DirectML model performs better in general, all three models are useful for various purposes. Uncertain predicted values can be identified by comparing the three models, and when the three models are combined, they can provide more accurate predictions than any one of them individually. Finally, we present our compiled solute parameter, solvation free energy, and solvation enthalpy databases (SoluteDB, dGsolvDBx, dHsolvDB) and provide public access to our final prediction models through a simple web-based tool, software packages, and source code.
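The abstract states that predicted solute parameters are converted to solvation free energy through a linear free energy relationship. As a minimal sketch of that step, the block below uses the widely published Abraham gas-to-solvent form, log K = c + eE + sS + aA + bB + lL, and the conversion dGsolv = -RT ln(10) log K, assuming K is a ratio of molar concentrations. The function names and all numeric parameter values are hypothetical placeholders chosen for illustration; they are not taken from the paper or its databases, and this is not the authors' code.

```python
import math

# Gas constant in kcal/(mol K).
R_KCAL = 1.98720425864083e-3


def abraham_log_k(solute, solvent):
    """Return log10 K for gas-to-solvent partition from the Abraham LFER.

    `solute` holds the descriptors E, S, A, B, L.
    `solvent` holds the fitted coefficients c, e, s, a, b, l.
    """
    return (
        solvent["c"]
        + solvent["e"] * solute["E"]
        + solvent["s"] * solute["S"]
        + solvent["a"] * solute["A"]
        + solvent["b"] * solute["B"]
        + solvent["l"] * solute["L"]
    )


def solvation_free_energy(solute, solvent, temperature=298.0):
    """Return dGsolv in kcal/mol as -R T ln(10) log10 K."""
    log_k = abraham_log_k(solute, solvent)
    return -R_KCAL * temperature * math.log(10.0) * log_k


# Hypothetical placeholder values, NOT real solute or solvent parameters.
example_solute = {"E": 0.5, "S": 0.5, "A": 0.0, "B": 0.5, "L": 3.0}
example_solvent = {"c": 0.1, "e": 0.0, "s": 1.0, "a": 2.0, "b": 0.5, "l": 0.9}

if __name__ == "__main__":
    print("log K =", abraham_log_k(example_solute, example_solvent))
    print("dGsolv (kcal/mol) =", solvation_free_energy(example_solute, example_solvent))
```

In this picture, SoluteGC and SoluteML would supply the solute dictionary, while the solvent coefficients come from separate fits for each solvent. Solvation enthalpy is commonly estimated with an analogous linear form using its own set of solvent coefficients.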
Pages: 433-446
Page count: 14