Approximating Prediction Uncertainty for Random Forest Regression Models

Cited by: 98
Authors
Coulston, John W. [1]
Blinn, Christine E. [2]
Thomas, Valerie A. [2]
Wynne, Randolph H. [2]
Affiliations
[1] USDA Forest Serv, Southern Res Stn, Blacksburg, VA USA
[2] Virginia Polytech Inst & State Univ, Dept Forest Resources & Environm Conservat, Blacksburg, VA 24061 USA
Keywords
MACHINE LEARNING ALGORITHMS; LAND-COVER DATABASE; BIOMASS
DOI
10.14358/PERS.82.3.189
Chinese Library Classification (CLC) number
P9 [Physical Geography]
Discipline classification code
0705; 070501
Abstract
The use of machine learning approaches such as random forest has increased for the spatial modeling and mapping of continuous variables. Random forest is a non-parametric ensemble approach, and unlike traditional regression approaches, it offers no direct quantification of prediction error. Understanding prediction uncertainty is important when using model-based continuous maps as inputs to other modeling applications such as fire modeling. Here we use a Monte Carlo approach to quantify prediction uncertainty for random forest regression models. We test the approach by simulating maps of dependent and independent variables with known characteristics and comparing actual errors with prediction errors. Our approach produced conservative prediction intervals across most of the range of predicted values. However, because the Monte Carlo approach was data driven, prediction intervals were either too wide or too narrow in sparse parts of the prediction distribution. Overall, our approach provides reasonable estimates of prediction uncertainty for random forest regression models.
Pages: 189-197
Page count: 9