Atomistic Descriptors for Machine Learning Models of Solubility Parameters for Small Molecules and Polymers

Cited by: 12
Authors
Chi, Mingzhe [1]
Gargouri, Rihab [2]
Schrader, Tim [1]
Damak, Kamel [2]
Maalej, Ramzi [2]
Sierka, Marek [1]
Affiliations
[1] Friedrich Schiller Univ Jena, Otto Schott Inst Mat Res, D-07743 Jena, Germany
[2] Sfax Univ, Fac Sci Sfax, Georesources Mat Environm & Global Changes Lab GE, Sfax 3018, Tunisia
Keywords
machine learning; polymer; properties prediction; kernel ridge regression; selection
DOI
10.3390/polym14010026
Chinese Library Classification number
O63 [Polymer chemistry (high polymers)]
Discipline classification codes
070305; 080501; 081704
Abstract
Descriptors derived from atomic structure and quantum chemical calculations for small molecules representing polymer repeat elements were evaluated for machine learning models to predict the Hildebrand solubility parameters of the corresponding polymers. Since reliable cohesive energy density data and solubility parameters for polymers are difficult to obtain, the experimental heat of vaporization ΔH_vap of a set of small molecules was used as a proxy property to evaluate the descriptors. Using the atomistic descriptors, the multilinear regression model showed good accuracy in predicting ΔH_vap of the small-molecule set, with a mean absolute error of 2.63 kJ/mol for training and 3.61 kJ/mol for cross-validation. Kernel ridge regression showed similar performance for the small-molecule training set but slightly worse accuracy for the prediction of ΔH_vap of molecules representing repeating polymer elements. The Hildebrand solubility parameters of the polymers derived from the atomistic descriptors of the repeating polymer elements showed good correlation with values from the CROW polymer database.
Pages: 11
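
The abstract treats the heat of vaporization as a proxy for the cohesive energy density that underlies the Hildebrand solubility parameter. A minimal sketch of that link, assuming the standard textbook definition δ = sqrt((ΔH_vap − RT) / V_m), is given here. The function name `hildebrand_parameter` and the toluene input values are illustrative assumptions and are not taken from the paper, whose exact workflow may differ.

```python
import math

R = 8.314462618  # molar gas constant in J/(mol*K)


def hildebrand_parameter(dh_vap_j_per_mol, molar_volume_cm3_per_mol,
                         temperature_k=298.15):
    """Return the Hildebrand solubility parameter in MPa^0.5.

    Uses the textbook relation delta = sqrt((dH_vap - R*T) / V_m).
    With dH_vap in J/mol and V_m in cm^3/mol, the cohesive energy
    density is in J/cm^3, which is numerically equal to MPa.
    """
    cohesive_energy = dh_vap_j_per_mol - R * temperature_k  # J/mol
    if cohesive_energy <= 0 or molar_volume_cm3_per_mol <= 0:
        raise ValueError("need dH_vap > R*T and a positive molar volume")
    ced = cohesive_energy / molar_volume_cm3_per_mol  # J/cm^3 == MPa
    return math.sqrt(ced)


# Approximate, illustrative inputs for toluene near 25 degrees C
# (assumed values for demonstration, not data from the paper).
delta_toluene = hildebrand_parameter(38000.0, 106.8)
print(f"{delta_toluene:.1f} MPa^0.5")
```

With these assumed inputs the result is roughly 18 MPa^0.5. In a descriptor-based workflow like the one the abstract describes, a predicted ΔH_vap would take the place of the experimental value passed to the function.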