Regression conformal prediction with random forests

Cited by: 5
Authors
Ulf Johansson
Henrik Boström
Tuve Löfström
Henrik Linusson
Affiliations
[1] University of Borås, School of Business and IT
[2] Stockholm University, Department of Computer and Systems Sciences
Source
Machine Learning | 2014 / Vol. 97
Keywords
Conformal prediction; Random forests; Regression;
DOI
Not available
Abstract
Regression conformal prediction produces prediction intervals that are valid, i.e., the probability of excluding the correct target value is bounded by a predefined confidence level. The most important criterion when comparing conformal regressors is efficiency; the prediction intervals should be as tight (informative) as possible. In this study, the use of random forests as the underlying model for regression conformal prediction is investigated and compared to existing state-of-the-art techniques, which are based on neural networks and k-nearest neighbors. In addition to their robust predictive performance, random forests allow for determining the size of the prediction intervals by using out-of-bag estimates instead of requiring a separate calibration set. An extensive empirical investigation, using 33 publicly available data sets, was undertaken to compare the use of random forests to existing state-of-the-art conformal predictors. The results show that the suggested approach, on almost all confidence levels and using both standard and normalized nonconformity functions, produced significantly more efficient conformal predictors than the existing alternatives.
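The interval-sizing step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the absolute out-of-bag errors of a forest have already been computed elsewhere, and it applies the standard conformal rank rule to them with the unnormalized absolute-error nonconformity function; this is an illustration under those assumptions, not necessarily the exact procedure used in the paper. The names `conformal_half_width` and `prediction_interval` are hypothetical.

```python
import math


def conformal_half_width(oob_abs_errors, confidence):
    """Half-width of the prediction interval at the given confidence level.

    oob_abs_errors: |y_i - oob_prediction_i| for each training example
    (hypothetical input, assumed to come from a fitted forest).
    confidence: target confidence level in (0, 1), e.g. 0.9.
    """
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    n = len(oob_abs_errors)
    # Standard conformal rank: the ceil((n + 1) * confidence)-th smallest score.
    rank = math.ceil((n + 1) * confidence)
    if rank > n:
        # Too few calibration scores to support this confidence level.
        return math.inf
    return sorted(oob_abs_errors)[rank - 1]


def prediction_interval(point_prediction, half_width):
    """Symmetric interval around a point prediction (unnormalized case)."""
    return (point_prediction - half_width, point_prediction + half_width)
```

A smaller half-width at the same confidence level corresponds to what the abstract calls a more efficient conformal regressor. A normalized nonconformity function would instead scale each error by a per-example difficulty estimate, which this sketch does not cover.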
Pages: 155-176
Page count: 21