Interpretable, Probability-Based Confidence Metric for Continuous Quantitative Structure-Activity Relationship Models

Cited by: 33
Authors
Keefer, Christopher E. [1 ]
Kauffman, Gregory W. [3 ]
Gupta, Rishi Raj [2 ]
Affiliations
[1] Pfizer Inc, Computat ADME Grp, Dept Pharmacokinet Dynam & Drug Metab, Groton, CT 06340 USA
[2] Pfizer Inc, Res Ctr Emphasis, Groton, CT 06340 USA
[3] Pfizer Inc, Neurosci Res Unit, Worldwide Med Chem, Cambridge, MA 02139 USA
Keywords
QSAR; APPLICABILITY; DOMAIN; SIMILARITY; SET;
DOI
10.1021/ci300554t
Chinese Library Classification number
R914 [Medicinal chemistry];
Discipline classification code
100701 ;
Abstract
A great deal of research has gone into the development of robust confidence in prediction and applicability domain (AD) measures for quantitative structure-activity relationship (QSAR) models in recent years. Much of the attention has historically focused on structural similarity, which can be defined in many forms and flavors. A concept that is frequently overlooked in the realm of the QSAR applicability domain is how the local activity landscape plays a role in how accurate a prediction is or is not. In this work, we describe an approach that pairs information about both the chemical similarity and activity landscape of a test compound's neighborhood into a single calculated confidence value. We also present an approach for converting this value into an interpretable confidence metric that has a simple and informative meaning across data sets. The approach will be introduced to the reader in the context of models built upon four diverse literature data sets. The steps we will outline include the definition of similarity used to determine nearest neighbors (NN), how we incorporate the NN activity landscape with a similarity-weighted root-mean-square distance (wRMSD) value, and how that value is then calibrated to generate an intuitive confidence metric for prospective application. Finally, we will illustrate the prospective performance of the approach on five proprietary models whose predictions and confidence metrics have been tracked for more than a year.
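The abstract names a similarity-weighted root-mean-square distance (wRMSD) but does not give its formula. The sketch below shows one plausible reading, offered only as an assumption: the squared differences between the test compound's predicted activity and each nearest neighbor's activity are averaged with the neighbor similarities as weights. The function name `weighted_rmsd`, its parameters, and the weighting scheme are hypothetical and are not taken from the paper.

```python
import math


def weighted_rmsd(predicted, neighbor_activities, similarities):
    """Similarity-weighted RMS distance between a predicted activity and
    the activities of the nearest neighbors.

    Assumed form (not confirmed by the abstract):
        sqrt( sum_i s_i * (y_i - predicted)^2 / sum_i s_i )
    where s_i is the similarity of neighbor i and y_i is its activity.
    """
    if len(neighbor_activities) != len(similarities):
        raise ValueError("need one similarity per neighbor activity")
    total_weight = sum(similarities)
    if total_weight <= 0:
        raise ValueError("similarities must sum to a positive value")
    # Each squared deviation counts in proportion to the neighbor's similarity.
    weighted_sq = sum(
        s * (y - predicted) ** 2
        for y, s in zip(neighbor_activities, similarities)
    )
    return math.sqrt(weighted_sq / total_weight)


# Hypothetical example: three neighbors with made-up activities and similarities.
value = weighted_rmsd(5.0, [5.2, 4.9, 6.0], [0.9, 0.8, 0.4])
print(round(value, 3))
```

Under this reading, a small value means that similar neighbors have activities close to the prediction (a smooth local activity landscape), while a large value signals disagreement. Turning the value into a calibrated confidence metric is a separate step that this sketch does not attempt.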
Pages: 368-383
Page count: 16