Multinomial Logistic Regression and Random Forest Classifiers in Digital Mapping of Soil Classes in Western Haiti

Cited by: 15
Authors
Jeune, Wesly [1]
Francelino, Marcio Rocha [2]
de Souza, Eliana [3]
Fernandes Filho, Elpidio Inacio [2]
Rocha, Genelicio Crusoe [2]
Affiliations
[1] Univ Quisqueya, Fac Sci Agr & Environm, Port Au Prince, Ouest, Haiti
[2] Univ Fed Vicosa, Dept Solos, Vicosa, MG, Brazil
[3] Univ Fed Vicosa, Dept Solos, Programa Posgrad Solos & Nutr Plantas, Vicosa, MG, Brazil
Source
REVISTA BRASILEIRA DE CIENCIA DO SOLO | 2018 / Vol. 42
Keywords
auxiliary data; digital soil mapping; soil survey; data-mining; CLASSIFICATION; MAP; CLIMATE; STATE;
DOI
10.1590/18069657rbcs20170133
Chinese Library Classification (CLC) number
S15 [Soil Science];
Discipline classification code
0903; 090301;
Abstract
Digital soil mapping (DSM) has been increasingly used to provide quick and accurate spatial information to support decision-makers in agricultural and environmental planning programs. In this study, we used a DSM approach to map soils in western Haiti and compare the performance of the Multinomial Logistic Regression (MLR) with Random Forest (RF) to classify the soils. The study area of 4,300 km² is mostly composed of diverse limestone rocks, alluvial deposits, and, to a lesser extent, basalt. A soil survey was conducted whereby soils were described and classified at 258 sites. Soil samples were collected and subjected to physical and chemical analyses. Recursive Feature Elimination (RFE) was used to select the most important covariates from auxiliary data, such as climate, lithology, and morphometric properties to describe the soil-landscape relationship. Mapping performance was assessed by the Kappa index and overall accuracy derived from a confusion matrix generated using a 5-fold cross validation process. In addition, an external mapping validation was carried out using an independent soil dataset. Accordingly, the soil dataset was split into 80 % and 20 % for training and validation of the models, respectively. No significant statistical difference (Z = 0.56 < |1.96|) was found between maps generated with both classifiers (Kappa index 0.45 for MLR and 0.42 for RF). Based on the Kappa values, the classification performance can be characterized as moderate for both algorithms. Surprisingly, the RF classifier outperformed MLR in the validation process (Kappa values of 0.55 and 0.33, respectively). These results suggest a higher generalization ability of RF. However, no significant statistical difference (Z = 1.83 < |1.96|) was observed. The soil map derived from RF indicated the occurrence of Leptosols (48.5 %), Gleysols (19.6 %), Chernozems (8 %), and Fluvisols (6.6 %) in most of the study area.
The DSM approaches proved suitable for mapping soils in western Haiti and could be used in other parts of the country, thereby closing information gaps with regard to Haitian soils.
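The abstract reports overall accuracy, the Kappa index, and a Z test comparing the two classifiers. As a rough illustration of how such figures can be derived from confusion matrices, here is a minimal Python sketch. The function names and the example matrices are hypothetical, and the variance formula is the simple large-sample approximation for Cohen's kappa, which is an assumption; the record does not say which estimator the authors used, so this will not necessarily reproduce their Z values.

```python
import math


def agreement_stats(cm):
    """Return (p_o, p_e, kappa, n) for a square confusion matrix.

    cm[i][j] = number of sites of reference class i predicted as class j.
    p_o is the overall accuracy, p_e the agreement expected by chance.
    """
    k = len(cm)
    n = sum(sum(row) for row in cm)
    p_o = sum(cm[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in cm]
    col_totals = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (p_o - p_e) / (1.0 - p_e)
    return p_o, p_e, kappa, n


def kappa_variance(cm):
    """Large-sample approximation of var(kappa); an assumed estimator."""
    p_o, p_e, _, n = agreement_stats(cm)
    return p_o * (1.0 - p_o) / (n * (1.0 - p_e) ** 2)


def kappa_z_test(cm_a, cm_b):
    """Z statistic for the difference between two independent kappas."""
    kappa_a = agreement_stats(cm_a)[2]
    kappa_b = agreement_stats(cm_b)[2]
    return abs(kappa_a - kappa_b) / math.sqrt(
        kappa_variance(cm_a) + kappa_variance(cm_b)
    )


if __name__ == "__main__":
    # Hypothetical 3-class confusion matrices, not data from the study.
    cm_model_a = [[20, 5, 5], [4, 18, 8], [6, 7, 27]]
    cm_model_b = [[18, 6, 6], [5, 17, 8], [7, 8, 25]]
    for name, cm in (("A", cm_model_a), ("B", cm_model_b)):
        p_o, _, kappa, _ = agreement_stats(cm)
        print(f"model {name}: accuracy={p_o:.2f}, kappa={kappa:.2f}")
    z = kappa_z_test(cm_model_a, cm_model_b)
    # |Z| below 1.96 means no significant difference at the 5 % level.
    print(f"Z={z:.2f}, significant={z > 1.96}")
```

Under this convention, a Z value below 1.96 is read the same way as in the abstract: the difference between the two Kappa values is not significant at the 5 % level.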
Pages: 20