Model Selection Using K-Means Clustering Algorithm for the Symmetrical Segmentation of Remote Sensing Datasets

Cited by: 16
Authors
Ali, Ishfaq [1]
Rehman, Atiq Ur [2]
Khan, Dost Muhammad [1]
Khan, Zardad [1]
Shafiq, Muhammad [3]
Choi, Jin-Ghoo [3]
Affiliations
[1] Abdul Wali Khan Univ, Dept Stat, Mardan 23200, Pakistan
[2] Int Islam Univ, Fac Basic & Appl Sci, Dept Math & Stat, Islamabad 44000, Pakistan
[3] Yeungnam Univ, Dept Informat & Commun Engn, Gyongsan 38541, South Korea
Source
SYMMETRY-BASEL | 2022 / Vol. 14 / Issue 06
Funding
National Research Foundation, Singapore;
Keywords
unsupervised clustering; k-means; balanced optimal number of clusters; symmetry; clustering validity indices; remote sensing; root mean square error; satellite images; BIG DATA; DATA SET; NUMBER;
DOI
10.3390/sym14061149
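Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and Earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract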
The importance of unsupervised clustering methods is well established in the statistics and machine learning literature. Many sophisticated unsupervised classification techniques have been made available to deal with a growing number of datasets. Because of its simplicity and its efficiency in clustering large datasets, the k-means clustering algorithm remains popular and widely used in the machine learning community. However, as with other clustering methods, it requires the balanced number of clusters to be chosen in advance. The primary aim of this paper is to develop a novel, data-driven method for finding the optimum number of clusters, k. Taking the cluster symmetry property into account, the k-means algorithm is applied repeatedly over a range of k values within which the balanced optimum k is expected to lie. The selection is based on the uniqueness and symmetry of the centroid values of the resulting clusters, and the final k is the value for which symmetry is observed. We evaluated the proposed algorithm's performance on different simulated datasets with controlled parameters and also on real datasets taken from the UCI machine learning repository. We also evaluated the proposed method on remote sensing applications, such as monitoring deforestation and urbanization, using satellite images of the Islamabad region in Pakistan, taken from the Sentinel-2B satellite of the United States Geological Survey. From the experimental results and real data analysis, it is concluded that the proposed algorithm achieves higher accuracy and lower root mean square error than the existing methods.
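The outer loop described in the abstract, running k-means over a range of candidate k values and comparing root mean square error, can be illustrated with a minimal sketch. The abstract does not spell out the symmetry criterion used to pick the final k, so the sketch below does not attempt it and only reports the RMSE for each k. The function names (`farthest_first`, `kmeans`, `rmse`, `scan_k`), the deterministic farthest-first seeding, and the toy data are all assumptions made for illustration, not the authors' implementation.

```python
import math


def farthest_first(points, k):
    """Deterministic seeding (an assumption of this sketch): start at
    points[0], then repeatedly add the point farthest from the seeds so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm on a list of equal-length tuples; returns centroids."""
    centroids = farthest_first(points, k)
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                updated.append(centroids[j])  # leave an empty cluster in place
        if updated == centroids:
            break
        centroids = updated
    return centroids


def rmse(points, centroids):
    """Root mean square distance from each point to its nearest centroid."""
    squared = [min(math.dist(p, c) for c in centroids) ** 2 for p in points]
    return math.sqrt(sum(squared) / len(points))


def scan_k(points, k_values):
    """Run k-means once per candidate k and record the resulting RMSE."""
    return {k: rmse(points, kmeans(points, k)) for k in k_values}


# Toy data (hypothetical): three 3x3 grids of points around well-separated centres.
centres = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [
    (cx + 0.5 * dx, cy + 0.5 * dy)
    for cx, cy in centres
    for dx in (-1, 0, 1)
    for dy in (-1, 0, 1)
]

results = scan_k(points, range(1, 5))
for k, err in results.items():
    print(f"k={k}: RMSE={err:.3f}")
```

On this toy data the RMSE drops sharply until k reaches the number of generated groups and changes little afterwards; a selection rule such as the paper's symmetry criterion would be applied to the per-k results at that point.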
Pages: 19