Accurate and Precise Prediction of Soil Properties from a Large Mid-Infrared Spectral Library

Cited by: 103
Authors
Dangal, Shree R. S. [1]
Sanderman, Jonathan [1]
Wills, Skye [2]
Ramirez-Lopez, Leonardo [3]
Affiliations
[1] Woods Hole Res Ctr, 149 Woods Hole Rd, Falmouth, MA 02540 USA
[2] USDA, NRCS, 100 Centennial Mall North, Lincoln, NE 68508 USA
[3] BUCHI Labortechnik AG, NIR Data Analyt, Meierseggstr 40, CH-9230 Flawil, Switzerland
Funding
National Institute of Food and Agriculture (United States)
Keywords
local model; partial least squares regression; random forest; Cubist; MIR spectral library; prediction uncertainty; PARTIAL LEAST-SQUARES; DIFFUSE-REFLECTANCE SPECTROSCOPY; NEAR-INFRARED SPECTROSCOPY; ARTIFICIAL NEURAL-NETWORK; ORGANIC-CARBON; MOISTURE-CONTENT; AUSTRALIAN SOIL; TOTAL NITROGEN; REGRESSION; NIR
DOI
10.3390/soilsystems3010011
Chinese Library Classification number
S15 [Soil science]
Discipline classification code
0903; 090301
Abstract
Diffuse reflectance spectroscopy (DRS) is emerging as a rapid and cost-effective alternative to routine laboratory analysis for many soil properties. However, it has primarily been applied in project-specific contexts. Here, we assess DRS at the scale of the continental United States using the large (n > 50,000) USDA National Soil Survey Center mid-infrared spectral library and its associated soil characterization database. We tested and optimized several advanced statistical approaches for providing routine predictions of numerous soil properties relevant to studying carbon cycling. On independent validation sets, the machine learning algorithms Cubist and memory-based learner (MBL) both outperformed random forest (RF) and partial least squares regression (PLSR) and produced excellent overall models, with a mean R² of 0.92 (mean ratio of performance to deviation = 6.5) across all 10 soil properties. We found that root-mean-square error (RMSE) was misleading for understanding the actual uncertainty of any particular prediction; therefore, we developed routines to assess the prediction uncertainty for all models except Cubist. The MBL models produced much more precise predictions than global PLSR and RF. Finally, we present several techniques that can be used to flag predictions for new samples that may be unreliable because their spectra fall outside the calibration set.
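The abstract reports model quality as R², RMSE, and the ratio of performance to deviation (RPD). As a rough illustration of how these three figures relate, the sketch below computes them for paired observed and predicted values. It is not the authors' code: the function name validation_metrics and the sample numbers are hypothetical, R² is taken here as 1 - SS_res/SS_tot (some studies use the squared correlation instead), and RPD is assumed to be the sample standard deviation of the observed values divided by RMSE.

```python
import math
import statistics


def validation_metrics(observed, predicted):
    """Return R^2, RMSE and RPD for paired observed/predicted values.

    Assumed definitions (not taken from the paper):
      R^2  = 1 - SS_res / SS_tot
      RMSE = sqrt(mean squared residual)
      RPD  = sample SD of observed values / RMSE
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)

    mean_obs = statistics.fmean(observed)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)

    rmse = math.sqrt(ss_res / len(observed))
    r2 = 1.0 - ss_res / ss_tot
    rpd = statistics.stdev(observed) / rmse
    return {"r2": r2, "rmse": rmse, "rpd": rpd}


if __name__ == "__main__":
    # Made-up values for illustration only, not data from the study.
    obs = [1.0, 2.0, 3.0, 4.0, 5.0]
    pred = [1.1, 1.9, 3.2, 3.8, 5.0]
    print(validation_metrics(obs, pred))
```

All three numbers summarize a whole validation set, so they say little about the error of one individual prediction, which is the limitation the abstract raises about RMSE.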
Pages: 1-23
Page count: 23