Do you know your r²?

Cited by: 19
Author
Avdeef, Alex [1]
Affiliation
[1] In ADME Res, 1732 First Ave 102, New York, NY 10128 USA
Keywords
coefficient of determination; linear correlation coefficient; root-mean-square error; linear regression; EVALUATE; MODELS;
DOI
10.5599/admet.888
Chinese Library Classification number
R914 [Medicinal Chemistry];
Discipline classification code
100701 ;
Abstract
The prediction of solubility of drugs usually calls on the use of several open-source/commercially-available computer programs in the various calculation steps. Popular statistics to indicate the strength of the prediction model include the coefficient of determination (r²), Pearson's linear correlation coefficient (r_Pearson), and the root-mean-square error (RMSE), among many others. When a program calculates these statistics, slightly different definitions may be used. This commentary briefly reviews the definitions of three types of r² and RMSE statistics (model validation, bias compensation, and Pearson) and how systematic errors due to shortcomings in solubility prediction models can be differently indicated by the choice of statistical indices. The indices we have employed in recently published papers on the prediction of solubility of druglike molecules were unclear, especially in cases of drugs from 'beyond the Rule of 5' chemical space, as simple prediction models showed distinctive 'bias-tilt' systematic type scatter. © 2021 by the authors. This article is an open-access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/4.0/).
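To illustrate how the choice of index can change the picture, here is a minimal Python sketch of three r² variants and the RMSE. The formulas are commonly used textbook definitions and are an assumption here; they are not necessarily the exact expressions given in the commentary. The function names and the data are hypothetical, with the predictions constructed to show both an offset (bias) and a slope error (tilt).

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def r2_validation(obs, pred):
    # 1 - SS_res / SS_tot, with residuals taken directly as obs - pred
    m = mean(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - m) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

def r2_bias_compensated(obs, pred):
    # Assumed definition: subtract the mean signed error from the
    # predictions before computing the validation-type r2
    bias = mean([p - o for o, p in zip(obs, pred)])
    shifted = [p - bias for p in pred]
    return r2_validation(obs, shifted)

def r2_pearson(obs, pred):
    # Square of Pearson's linear correlation coefficient
    mo, mp = mean(obs), mean(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov * cov / (var_o * var_p)

def rmse(obs, pred):
    # Root-mean-square error of obs - pred
    n = len(obs)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / n)

# Made-up log-solubility values, for illustration only.
# pred = 0.5 * obs - 1.5, i.e. perfectly correlated but biased and tilted.
obs = [-6.0, -5.0, -4.0, -3.0, -2.0]
pred = [-4.5, -4.0, -3.5, -3.0, -2.5]

print(r2_validation(obs, pred))
print(r2_bias_compensated(obs, pred))
print(r2_pearson(obs, pred))
print(rmse(obs, pred))
```

On this made-up data the three r² values come out to 0.625, 0.75, and 1.0: the Pearson form is blind to both the offset and the tilt, the bias-compensated form removes only the offset, and the validation form penalizes both.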
Pages: 69-74
Page count: 6
Related papers
7 items in total
[1]   Can small drugs predict the intrinsic aqueous solubility of 'beyond Rule of 5' big drugs? [J].
Avdeef, Alex ;
Kansy, Manfred .
ADMET AND DMPK, 2020, 8 (03) :180-+
[2]   Prediction of aqueous intrinsic solubility of druglike molecules using Random Forest regression trained with Wiki-pS0 database [J].
Avdeef, Alex .
ADMET AND DMPK, 2020, 8 (01) :29-77
[3]   Real External Predictivity of QSAR Models: How To Evaluate It? Comparison of Different Validation Criteria and Proposal of Using the Concordance Correlation Coefficient [J].
Chirico, Nicola ;
Gramatica, Paola .
JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2011, 51 (09) :2320-2335
[4]   Findings of the Challenge To Predict Aqueous Solubility [J].
Hopfinger, Anton J. ;
Esposito, Emilio Xavier ;
Llinas, A. ;
Glen, R. C. ;
Goodman, J. M. .
JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2009, 49 (01) :1-5
[5]   Solubility Challenge Revisited after Ten Years, with Multilab Shake-Flask Data, Using Tight (SD ∼ 0.17 log) and Loose (SD ∼ 0.62 log) Test Sets [J].
Llinas, Antonio ;
Avdeef, Alex .
JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2019, 59 (06) :3036-3040
[6]   How to evaluate models: Observed vs. predicted or predicted vs. observed? [J].
Pineiro, Gervasio ;
Perelman, Susana ;
Guerschman, Juan P. ;
Paruelo, Jose M. .
ECOLOGICAL MODELLING, 2008, 216 (3-4) :316-322
[7]   Walters W.P., 2014, CHEMOINFORMATICS DRU, P1, DOI 10.1002/9781118742785.ch1