Introducing Conformal Prediction in Predictive Modeling. A Transparent and Flexible Alternative to Applicability Domain Determination

Cited by: 133
Authors
Norinder, Ulf [1]
Carlsson, Lars [2]
Boyer, Scott [2,4]
Eklund, Martin [2,3]
Affiliations
[1] H Lundbeck & Co AS, DK-2500 Valby, Denmark
[2] AstraZeneca Res & Dev, SE-43183 Molndal, Sweden
[3] Univ Calif San Francisco, Dept Surg, San Francisco, CA 94115 USA
[4] Swedish Toxicol Sci Res Ctr, SE-15136 Sodertalje, Sweden
Keywords
QSAR MODELS; RANDOM FOREST;
DOI
10.1021/ci5001168
Chinese Library Classification number
R914 [Medicinal Chemistry];
Discipline classification code
100701;
Abstract
Conformal prediction is introduced as an alternative approach to domain applicability estimation. The advantages of using conformal prediction are as follows: First, the approach is based on a consistent and well-defined mathematical framework. Second, the understanding of the confidence level concept in conformal predictions is straightforward, e.g., a confidence level of 0.8 means that the conformal predictor will commit, at most, 20% errors (i.e., true values outside the assigned prediction range). Third, the confidence level can be varied depending on the situation where the model is to be applied and the consequences of such changes are readily understandable, i.e., prediction ranges are increased or decreased, and the changes can immediately be inspected. We demonstrate the usefulness of conformal prediction by applying it to 10 publicly available data sets.
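
The confidence-level idea in the abstract can be illustrated with a minimal sketch of split (inductive) conformal regression. This is not the authors' implementation: the synthetic data, the least-squares line used as a stand-in for the underlying model, and the helper names `fit_line`, `half_width`, and `make_data` are all assumptions made for illustration. The error guarantee holds on average over draws of exchangeable calibration and test data, so the error rate observed in a single run may land slightly above or below the nominal level.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a * x + b (stand-in for any model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def half_width(abs_residuals, confidence):
    """Half-width of the range: the ceil((n + 1) * confidence)-th smallest residual."""
    scores = sorted(abs_residuals)
    k = math.ceil((len(scores) + 1) * confidence)
    # With too few calibration points, only an unbounded range is valid.
    return scores[k - 1] if k <= len(scores) else math.inf


def make_data(rng, n):
    """Synthetic data: y = 2x + Gaussian noise (purely illustrative)."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


rng = random.Random(0)
train_x, train_y = make_data(rng, 200)   # used to fit the model
cal_x, cal_y = make_data(rng, 500)       # held out to calibrate the ranges
test_x, test_y = make_data(rng, 2000)    # used to check the error rate

a, b = fit_line(train_x, train_y)
cal_residuals = [abs(y - (a * x + b)) for x, y in zip(cal_x, cal_y)]

results = {}
for confidence in (0.8, 0.9):
    w = half_width(cal_residuals, confidence)
    # An error is a true value outside the assigned prediction range.
    errors = sum(abs(y - (a * x + b)) > w for x, y in zip(test_x, test_y))
    results[confidence] = (w, errors / len(test_x))
    print(f"confidence={confidence}: range = prediction +/- {w:.2f}, "
          f"error rate = {errors / len(test_x):.3f}")
```

Raising the confidence level from 0.8 to 0.9 widens the prediction range and lowers the error rate, which is the trade-off the abstract describes as readily inspectable.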
Pages: 1596-1603
Page count: 8