Three machine learning models for the 2019 Solubility Challenge

Cited by: 11
Authors
Mitchell, John B. O. [1, 2]
Affiliations
[1] Univ St Andrews, EaStCHEM Sch Chem, St Andrews KY16 9ST, Fife, Scotland
[2] Univ St Andrews, Biomed Sci Res Complex, St Andrews KY16 9ST, Fife, Scotland
Keywords
Aqueous intrinsic solubility; Solubility prediction; Random Forest; Extra Trees; Bagging; Consensus classifiers; Wisdom of Crowds; Inter-laboratory error; INTRINSIC AQUEOUS SOLUBILITY; DRUG SOLUBILITY; RANDOM FOREST; FREE-ENERGY; PREDICTION; SOLVATION; DISCOVERY;
DOI
10.5599/admet.835
Chinese Library Classification
R914 [Medicinal Chemistry]
Discipline classification code
100701
Abstract
We describe three machine learning models submitted to the 2019 Solubility Challenge. All are founded on tree-like classifiers, with one model being based on Random Forest and another on the related Extra Trees algorithm. The third model is a consensus predictor combining the former two with a Bagging classifier. We call this consensus classifier Vox Machinarum, and here discuss how it benefits from the Wisdom of Crowds. On the first 2019 Solubility Challenge test set of 100 low-variance intrinsic aqueous solubilities, Extra Trees is our best classifier. On the other, a high-variance set of 32 molecules, we find that Vox Machinarum and Random Forest both perform a little better than Extra Trees, and almost equally well as one another. We also compare the gold standard solubilities from the 2019 Solubility Challenge with a set of literature-based solubilities for most of the same compounds.
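The consensus idea in the abstract can be illustrated with a minimal sketch. It assumes the consensus is a plain per-compound average of the three models' predictions; the abstract does not say how Vox Machinarum combines them, so the averaging rule, the function names, and all numbers below are hypothetical. The one property shown is general: the RMSE of an averaged prediction can never exceed the average of the individual RMSEs.

```python
import math

def rmse(pred, obs):
    """Root-mean-square error between predicted and observed values."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def consensus(*model_preds):
    """Per-compound mean across models (assumed combination rule)."""
    return [sum(vals) / len(vals) for vals in zip(*model_preds)]

# Invented log S0 values for four compounds; not data from the paper.
observed = [-2.0, -4.5, -3.1, -5.8]
rf_pred = [-2.4, -4.0, -3.5, -5.0]   # stand-in for Random Forest output
et_pred = [-1.7, -4.9, -2.8, -6.1]   # stand-in for Extra Trees output
bag_pred = [-2.6, -4.2, -3.6, -5.3]  # stand-in for Bagging output

vox = consensus(rf_pred, et_pred, bag_pred)
individual = [rmse(p, observed) for p in (rf_pred, et_pred, bag_pred)]
mean_individual = sum(individual) / len(individual)

print("consensus RMSE:", rmse(vox, observed))
print("mean individual RMSE:", mean_individual)
```

The consensus is guaranteed to do no worse than the average member, but not to beat the best one, which is consistent with Extra Trees alone being the best performer on the low-variance test set.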
Pages: 215 / +
Page count: 37