Multinomial Logistic Regression and Random Forest Classifiers in Digital Mapping of Soil Classes in Western Haiti

Cited by: 15
Authors
Jeune, Wesly [1]
Francelino, Marcio Rocha [2]
de Souza, Eliana [3]
Fernandes Filho, Elpidio Inacio [2]
Rocha, Genelicio Crusoe [2]
Affiliations
[1] Univ Quisqueya, Fac Sci Agr & Environm, Port Au Prince, Ouest, Haiti
[2] Univ Fed Vicosa, Dept Solos, Vicosa, MG, Brazil
[3] Univ Fed Vicosa, Dept Solos, Programa Posgrad Solos & Nutr Plantas, Vicosa, MG, Brazil
Source
REVISTA BRASILEIRA DE CIENCIA DO SOLO | 2018, Vol. 42
Keywords
auxiliary data; digital soil mapping; soil survey; data-mining; CLASSIFICATION; MAP; CLIMATE; STATE
DOI
10.1590/18069657rbcs20170133
Chinese Library Classification number
S15 [Soil Science]
Discipline classification code
0903; 090301
Abstract
Digital soil mapping (DSM) has been increasingly used to provide quick and accurate spatial information to support decision-makers in agricultural and environmental planning programs. In this study, we used a DSM approach to map soils in western Haiti and compared the performance of Multinomial Logistic Regression (MLR) and Random Forest (RF) in classifying the soils. The study area of 4,300 km² is mostly composed of diverse limestone rocks, alluvial deposits, and, to a lesser extent, basalt. A soil survey was conducted in which soils were described and classified at 258 sites. Soil samples were collected and subjected to physical and chemical analyses. Recursive Feature Elimination (RFE) was used to select the most important covariates from auxiliary data, such as climate, lithology, and morphometric properties, to describe the soil-landscape relationship. Mapping performance was assessed by the Kappa index and overall accuracy derived from a confusion matrix generated using a 5-fold cross-validation process. In addition, an external mapping validation was carried out using an independent soil dataset. Accordingly, the soil dataset was split into 80% and 20% for training and validation of the models, respectively. No statistically significant difference (|Z| = 0.56 < 1.96) was found between the maps generated with the two classifiers (Kappa index 0.45 for MLR and 0.42 for RF). Based on the Kappa values, the classification performance can be characterized as moderate for both algorithms. Surprisingly, the RF classifier outperformed MLR in the validation process (Kappa values of 0.55 and 0.33, respectively). These results suggest a higher generalization ability of RF. However, this difference was also not statistically significant (|Z| = 1.83 < 1.96). The soil map derived from RF indicated the occurrence of Leptosols (48.5%), Gleysols (19.6%), Chernozems (8%), and Fluvisols (6.6%) in most of the study area.
The DSM approaches proved suitable for mapping soils in western Haiti and could be used in other parts of the country, thereby closing information gaps with regard to Haitian soils.
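The abstract evaluates both classifiers by overall accuracy and the Kappa index from a confusion matrix, and compares two Kappa values with a Z statistic against the 1.96 threshold. The following is a minimal sketch of those calculations, not the authors' implementation. The confusion matrices are hypothetical, the function names are illustrative, and the variance uses a simplified large-sample approximation, since the abstract does not state which variance formula was applied.

```python
import math

def overall_accuracy(cm):
    """Share of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total

def expected_agreement(cm):
    """Chance agreement computed from the row and column totals."""
    n = sum(sum(row) for row in cm)
    row_totals = [sum(row) for row in cm]
    col_totals = [sum(col) for col in zip(*cm)]
    return sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2

def kappa(cm):
    """Cohen's Kappa: observed agreement corrected for chance agreement."""
    po = overall_accuracy(cm)
    pe = expected_agreement(cm)
    return (po - pe) / (1 - pe)

def kappa_variance(cm):
    """Simplified large-sample variance of Kappa (an assumption of this sketch)."""
    n = sum(sum(row) for row in cm)
    po = overall_accuracy(cm)
    pe = expected_agreement(cm)
    return po * (1 - po) / (n * (1 - pe) ** 2)

def kappa_z(cm_a, cm_b):
    """Z statistic for the difference between two independent Kappa values."""
    diff = abs(kappa(cm_a) - kappa(cm_b))
    return diff / math.sqrt(kappa_variance(cm_a) + kappa_variance(cm_b))

# Hypothetical two-class confusion matrices (rows: reference, columns: predicted).
cm_model_a = [[20, 5], [10, 15]]
cm_model_b = [[18, 7], [9, 16]]

z = kappa_z(cm_model_a, cm_model_b)
print(f"Kappa A = {kappa(cm_model_a):.2f}, Kappa B = {kappa(cm_model_b):.2f}")
print("significant at 5%" if z > 1.96 else "not significant at 5%")
```

A Z value below 1.96 means the two Kappa values cannot be distinguished at the 5% level, which is the criterion the abstract applies to both the cross-validation and the external validation results.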
Pages: 20