Evaluating validation strategies on the performance of soil property prediction from regional to continental spectral data

Cited by: 51
Authors
Chen, Songchao [1, 2]
Xu, Hanyi [1]
Xu, Dongyun [1]
Ji, Wenjun [3]
Li, Shuo [4]
Yang, Meihua [5]
Hu, Bifeng [6]
Zhou, Yin [1, 7]
Wang, Nan [1]
Arrouays, Dominique [2]
Shi, Zhou [1, 8]
Affiliations
[1] Zhejiang Univ, Coll Environm & Resource Sci, Inst Appl Remote Sensing & Informat Technol, Hangzhou 310058, Peoples R China
[2] INRAE, Unite InfoSol, F-45075 Orleans, France
[3] China Agr Univ, Coll Land Sci & Technol, Beijing 100085, Peoples R China
[4] Cent China Normal Univ, Key Lab Geog Proc Anal & Simulat, Wuhan 430079, Peoples R China
[5] Yuzhang Normal Univ, Dept Environm Engn, Nanchang 330103, Jiangxi, Peoples R China
[6] Jiangxi Univ Finance & Econ, Sch Tourism & Urban Management, Dept Land Resource Management, Nanchang 330013, Jiangxi, Peoples R China
[7] Zhejiang Univ, Sch Publ Affairs, Inst Land Sci & Property Management, Hangzhou 310058, Peoples R China
[8] Minist Agr & Rural Affairs, Key Lab Spect Sensing, Hangzhou 310058, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Proximal soil sensing; Vis-NIR spectra; Model robustness; Soil organic carbon; Clay; Calibration sampling; NEAR-INFRARED SPECTROSCOPY; ORGANIC-CARBON CONTENT; NIR SPECTROSCOPY; MIDINFRARED SPECTROSCOPY; REFLECTANCE SPECTRA; SAMPLE SELECTION; CALIBRATION SET; NEURAL-NETWORK; MODEL; MOISTURE;
DOI
10.1016/j.geoderma.2021.115159
Chinese Library Classification
S15 [Soil science];
Subject classification code
0903 ; 090301 ;
Abstract
Visible-near infrared (vis-NIR) spectroscopy has been widely used to characterize soil information from field to global scales. Before a calibrated spectral predictive model is applied to acquire soil information, either independent validation or k-fold cross validation is used to evaluate model performance. However, there is no consensus on which validation strategy is more suitable and robust for evaluating model performance in studies at different scales. The objective of this study is to evaluate and compare model performance under the two validation strategies, coupled with different calibration sizes (calibration-to-validation ratios of 2:1, 4:1 and 9:1) and calibration sampling strategies (random sampling (RS), rank, Kennard-Stone (KS), rank-Kennard-Stone (RKS) and conditioned Latin hypercube sampling (cLHS)), across scales. A total of 17,272 vis-NIR spectra of mineral soils from the LUCAS data (continental scale), together with their soil organic carbon (SOC) and clay contents, were used in this study, and the dataset was further split into a national dataset (2761 samples in France) and five regional datasets (110 to 248 samples from five French administrative regions). To eliminate the effect of a changing validation set on model performance, a consistent test set (20% of total samples at each scale) was held out to evaluate all the combinations involved in the two validation strategies. Lin's concordance correlation coefficient (CCC) of the Cubist model was stable for both SOC and clay across calibration sizes, calibration sampling strategies and validation strategies for a large calibration size (>1400) at the national and continental scales. A larger calibration size can potentially improve model performance for a small dataset (<300) at the regional scale, and a wider calibration range would result in better model performance. No silver bullet was found among the different calibration sampling strategies at the regional scale.
For the five French regions (small datasets), we found a high variation (95th percentile minus 5th percentile) in the CCC among the models built from 50 repeated RS (0.10-0.44 for SOC, 0.16-0.52 for clay) and cLHS (0.08-0.40 for SOC, 0.12-0.36 for clay) runs. This finding indicates that a one-time RS or cLHS selection of the calibration set carries high uncertainty in model evaluation for a small dataset and should therefore be used with caution. We suggest the following: (1) for a large dataset (thousands of samples), either one-time random sampling for independent validation or k-fold cross validation would be appropriate; (2) for a small dataset (dozens to hundreds of samples), k-fold cross validation and/or repeated random sampling for independent validation would be more robust for spectral predictive model evaluation.
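The variation measure described in the abstract (95th percentile minus 5th percentile of the CCC over repeated random calibration draws, scored on a fixed test set) can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the function names, the synthetic data, the 4:1 calibration draw, and above all the ordinary least-squares model, which stands in for the Cubist model used in the paper. It is not the authors' code.

```python
import numpy as np


def lin_ccc(y_true, y_pred):
    """Lin's concordance correlation coefficient (population moments)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mean_t, mean_p = y_true.mean(), y_pred.mean()
    cov = np.mean((y_true - mean_t) * (y_pred - mean_p))
    return 2.0 * cov / (y_true.var() + y_pred.var() + (mean_t - mean_p) ** 2)


def fit_predict(X_cal, y_cal, X_new):
    """Least-squares linear model with intercept (stand-in, NOT Cubist)."""
    A = np.column_stack([np.ones(len(X_cal)), X_cal])
    coef, *_ = np.linalg.lstsq(A, y_cal, rcond=None)
    return np.column_stack([np.ones(len(X_new)), X_new]) @ coef


def repeated_rs_ccc(X, y, n_repeats=50, test_frac=0.2, cal_frac=0.8, seed=0):
    """CCC on a fixed test set for repeated random calibration draws.

    Assumed protocol: hold out `test_frac` of the samples once, then draw
    `cal_frac` of the remaining pool at random `n_repeats` times
    (cal_frac=0.8 mimics a 4:1 calibration-to-validation ratio).
    Returns the CCC values and their P95 - P5 spread.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    n_test = int(round(test_frac * len(y)))
    test_idx, pool_idx = idx[:n_test], idx[n_test:]
    n_cal = int(round(cal_frac * len(pool_idx)))

    cccs = np.empty(n_repeats)
    for r in range(n_repeats):
        cal_idx = rng.choice(pool_idx, size=n_cal, replace=False)
        pred = fit_predict(X[cal_idx], y[cal_idx], X[test_idx])
        cccs[r] = lin_ccc(y[test_idx], pred)

    p5, p95 = np.percentile(cccs, [5, 95])
    return cccs, p95 - p5


if __name__ == "__main__":
    # Synthetic "regional" dataset: 150 samples, 5 hypothetical predictors.
    rng = np.random.default_rng(42)
    X = rng.normal(size=(150, 5))
    y = X @ np.array([1.0, -0.5, 0.3, 0.0, 0.8]) + rng.normal(scale=1.0, size=150)
    cccs, spread = repeated_rs_ccc(X, y)
    print(f"median CCC = {np.median(cccs):.2f}, P95 - P5 = {spread:.2f}")
```

With a small sample pool, the spread reported by such a loop shows how strongly a single random calibration draw can shift the evaluation result, which is the motivation for the repeated-sampling recommendation above.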
Pages: 10
Related papers
70 in total
[1]   Soil Organic Carbon Prediction by Vis-NIR Spectroscopy: Case Study the Kur-Aras Plain, Azerbaijan [J].
Amin, Ismayilov ;
Fikrat, Feyziyev ;
Mammadov, Elton ;
Babayev, Maharram .
COMMUNICATIONS IN SOIL SCIENCE AND PLANT ANALYSIS, 2020, 51 (06) :726-734
[2]  
[Anonymous], 2016, SOIL GERMANY
[3]   GlobalSoilMap: Toward a Fine-Resolution Global Grid of Soil Properties [J].
Arrouays, Dominique ;
Grundy, Michael G. ;
Hartemink, Alfred E. ;
Hempel, Jonathan W. ;
Heuvelink, Gerard B. M. ;
Hong, S. Young ;
Lagacherie, Philippe ;
Lelyk, Glenn ;
McBratney, Alexander B. ;
McKenzie, Neil J. ;
Mendonca-Santos, Maria D. L. ;
Minasny, Budiman ;
Montanarella, Luca ;
Odeh, Inakwu O. A. ;
Sanchez, Pedro A. ;
Thompson, James A. ;
Zhang, Gan-Lin .
ADVANCES IN AGRONOMY, VOL 125, 2014, 125 :93-+
[4]   The challenge for the soil science community to contribute to the implementation of the UN Sustainable Development Goals [J].
Bouma, Johan ;
Montanarella, Luca ;
Evanylo, Gregory .
SOIL USE AND MANAGEMENT, 2019, 35 (04) :538-546
[5]   Sampling Strategies for Soil Property Mapping Using Multispectral Sentinel-2 and Hyperspectral EnMAP Satellite Data [J].
Castaldi, Fabio ;
Chabrillat, Sabine ;
van Wesemael, Bas .
REMOTE SENSING, 2019, 11 (03)
[6]   Near-infrared reflectance spectroscopic analysis of soil C and N [J].
Chang, CW ;
Laird, DA .
SOIL SCIENCE, 2002, 167 (02) :110-116
[7]   Rapid determination of soil classes in soil profiles using vis-NIR spectroscopy and multiple objectives mixed support vector classification [J].
Chen, S. ;
Li, S. ;
Ma, W. ;
Ji, W. ;
Xu, D. ;
Shi, Z. ;
Zhang, G. .
EUROPEAN JOURNAL OF SOIL SCIENCE, 2019, 70 (01) :42-53
[8]   Study on the Characterization of VNIR-MIR Spectra and Prediction of Soil Organic Matter in Paddy Soil [J].
Chen Song-chao ;
Peng Jie ;
Ji Wen-jun ;
Zhou Yin ;
He Ji-xiu ;
Shi Zhou .
SPECTROSCOPY AND SPECTRAL ANALYSIS, 2016, 36 (06) :1712-1716
[9]   Monitoring soil organic carbon in alpine soils using in situ vis-NIR spectroscopy and a multilayer perceptron [J].
Chen, Songchao ;
Xu, Dongyun ;
Li, Shuo ;
Ji, Wenjun ;
Yang, Meihua ;
Zhou, Yin ;
Hu, Bifeng ;
Xu, Hanyi ;
Shi, Zhou .
LAND DEGRADATION & DEVELOPMENT, 2020, 31 (08) :1026-1038
[10]   Fine resolution map of top- and subsoil carbon sequestration potential in France [J].
Chen, Songchao ;
Martin, Manuel P. ;
Saby, Nicolas P. A. ;
Walter, Christian ;
Angers, Denis A. ;
Arrouays, Dominique .
SCIENCE OF THE TOTAL ENVIRONMENT, 2018, 630 :389-400