A new approach to clustering soil profile data using the modified distance matrix

Cited by: 5
Authors
Shelia, Vakhtang [1, 2]
Hoogenboom, Gerrit [1, 2]
Affiliations
[1] Univ Florida, Dept Agr & Biol Engn, Gainesville, FL 32611 USA
[2] Univ Florida, Inst Sustainable Food Syst, Gainesville, FL 32611 USA
Keywords
Soil horizon; Numerical classification; Hierarchical algorithm; Ward method; WISE; Fuzzy k-means; Classification; Algorithms
DOI
10.1016/j.compag.2020.105631
Chinese Library Classification (CLC) number
S [Agricultural Sciences]
Subject classification code
09
Abstract
Applying data mining methods to large soil profile datasets can be very useful for many agricultural and natural resource management applications, ranging from crop modeling to soil taxonomy. Distance or dissimilarity measures are key components of these methods. A proximity measure, or distance, between two vectors can be calculated only when they have the same dimensions. For soil profile data, however, the matrices representing different soils and their horizon properties usually have different dimensions. The objectives of this study were to explore a new approach for creating a semi-metric based on adjustment of the soil profile horizons, to implement it in a computer application, and to apply the modified distance matrix calculation to maximize the use of soil horizon properties for soil data mining. We assume that each soil horizon is homogeneous, while the vertical heterogeneity of a soil profile is expressed through its different horizons. Therefore, any sublayer of a horizon has the same attribute values as the horizon itself. In our approach, we developed matrices with the same dimensions for the soil profiles and then calculated the proximity measure. The algorithm was implemented as an easy-to-use Fortran application that can calculate the modified distance matrix (MDM) for low- and high-dimensional soil profile data. The proposed approach was shown to be effective with existing reliable datasets, such as WISE Version 3.1, a global soil profile database developed by ISRIC. Hierarchical clustering was performed using the MDM, based on the original algorithm for soil profile horizon adjustment, with further integration into R. The principal finding is that the proposed modified distance matrix can be used with different clustering methods to cluster soil profile data on a horizon-by-horizon basis.
This study established a new methodology for calculating the modified distance matrix and applying it with different clustering algorithms to large sets of soil profile data obtained from detailed soil surveys.
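The horizon-adjustment idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' Fortran implementation, and every name and number in it is hypothetical. It assumes that horizons are contiguous and start at 0 cm, that two profiles are compared only over their shared depth, and that the distance is a thickness-weighted Euclidean distance; the abstract does not specify any of these choices. The one idea taken from the abstract is that a sublayer inherits the attribute values of the horizon that contains it, so two profiles can be split at the union of their horizon boundaries to obtain matrices of equal dimensions.

```python
from math import sqrt

# A profile is a list of horizons: (top_cm, bottom_cm, [attribute values]).
# Assumption: horizons are contiguous, ordered by depth, and start at 0 cm.


def sublayer_values(profile, top, bottom):
    """Return the attributes of the horizon that contains [top, bottom].

    A sublayer inherits the values of its parent horizon, following the
    homogeneity assumption stated in the abstract.
    """
    for h_top, h_bottom, values in profile:
        if h_top <= top and bottom <= h_bottom:
            return values
    raise ValueError("sublayer is not covered by any horizon")


def align(profile_a, profile_b):
    """Split both profiles at the union of their horizon boundaries."""
    # Assumption of this sketch: compare only the depth both profiles share.
    depth = min(profile_a[-1][1], profile_b[-1][1])
    cuts = sorted(
        {d for p in (profile_a, profile_b) for h in p for d in h[:2] if d <= depth}
    )
    layers = list(zip(cuts[:-1], cuts[1:]))
    rows_a = [sublayer_values(profile_a, t, b) for t, b in layers]
    rows_b = [sublayer_values(profile_b, t, b) for t, b in layers]
    return layers, rows_a, rows_b


def profile_distance(profile_a, profile_b):
    """Thickness-weighted Euclidean distance between two aligned profiles."""
    layers, rows_a, rows_b = align(profile_a, profile_b)
    total = 0.0
    for (top, bottom), va, vb in zip(layers, rows_a, rows_b):
        total += (bottom - top) * sum((x - y) ** 2 for x, y in zip(va, vb))
    shared_depth = layers[-1][1] - layers[0][0]
    return sqrt(total / shared_depth)


def distance_matrix(profiles):
    """Pairwise distances for a list of profiles, as a nested list."""
    return [[profile_distance(p, q) for q in profiles] for p in profiles]
```

For example, a profile with horizons at 0-20 cm and 20-60 cm and another with horizons at 0-30 cm and 30-60 cm are both split into the sublayers 0-20, 20-30, and 30-60 cm before comparison. The resulting matrix is symmetric with a zero diagonal, which are the properties required of a semi-metric; the triangle inequality is not guaranteed.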
Pages: 12