Model Selection Using K-Means Clustering Algorithm for the Symmetrical Segmentation of Remote Sensing Datasets

Cited by: 16
Authors
Ali, Ishfaq [1]
Rehman, Atiq Ur [2]
Khan, Dost Muhammad [1]
Khan, Zardad [1]
Shafiq, Muhammad [3]
Choi, Jin-Ghoo [3]
Affiliations
[1] Abdul Wali Khan Univ, Dept Stat, Mardan 23200, Pakistan
[2] Int Islam Univ, Fac Basic & Appl Sci, Dept Math & Stat, Islamabad 44000, Pakistan
[3] Yeungnam Univ, Dept Informat & Commun Engn, Gyongsan 38541, South Korea
Source
SYMMETRY-BASEL | 2022 / Vol. 14 / Issue 06
Funding
National Research Foundation of Singapore;
Keywords
unsupervised clustering; k-means; balanced optimal number of clusters; symmetry; clustering validity indices; remote sensing; root mean square error; satellite images; BIG DATA; DATA SET; NUMBER;
DOI
10.3390/sym14061149
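Chinese Library Classification
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification code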
07 ; 0710 ; 09 ;
Abstract
The importance of unsupervised clustering methods is well established in the statistics and machine learning literature. Many sophisticated unsupervised classification techniques have been made available to deal with a growing number of datasets. Due to its simplicity and efficiency in clustering a large dataset, the k-means clustering algorithm is still popular and widely used in the machine learning community. However, as with other clustering methods, it requires one to choose the balanced number of clusters in advance. This paper's primary emphasis is to develop a novel method for finding the optimum number of clusters, k, using a data-driven approach. Taking into account the cluster symmetry property, the k-means algorithm is applied multiple times to a range of k values within which the balanced optimum k value is expected. This is based on the uniqueness and symmetrical nature among the centroid values for the clusters produced, and we chose the final k value as the one for which symmetry is observed. We evaluated the proposed algorithm's performance on different simulated datasets with controlled parameters and also on real datasets taken from the UCI machine learning repository. We also evaluated the performance of the proposed method with the aim of remote sensing, such as in deforestation and urbanization, using satellite images of the Islamabad region in Pakistan, taken from the Sentinel-2B satellite of the United States Geological Survey. From the experimental results and real data analysis, it is concluded that the proposed algorithm has better accuracy and minimum root mean square error than the existing methods.
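The sweep over candidate k values described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not define the centroid-symmetry criterion, so the sketch only runs a plain Lloyd's k-means for each k and records the centroids and the within-cluster root mean square error for later inspection. The function names `kmeans` and `sweep_k`, the random initialization, and the synthetic data are all assumptions made for the example.

```python
import numpy as np


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means (hypothetical helper); returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Assumption: initialize centroids from k distinct data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Distance from every point to every centroid, shape (n, k).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    # Recompute labels so they match the returned centroids.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return centroids, dists.argmin(axis=1)


def sweep_k(X, k_values, seed=0):
    """Run k-means for each candidate k; record centroids and RMSE.

    The rule for choosing the final k (the paper's symmetry criterion)
    is deliberately left out because the abstract does not specify it.
    """
    results = {}
    for k in k_values:
        centroids, labels = kmeans(X, k, seed=seed)
        residuals = X - centroids[labels]
        rmse = float(np.sqrt(np.mean(np.sum(residuals ** 2, axis=1))))
        results[k] = {"centroids": centroids, "rmse": rmse}
    return results


# Synthetic data: three well-separated 2-D blobs (illustrative only).
demo_rng = np.random.default_rng(42)
X_demo = np.vstack([
    demo_rng.normal(loc=center, scale=0.2, size=(50, 2))
    for center in ((0.0, 0.0), (4.0, 4.0), (0.0, 4.0))
])

for k, info in sweep_k(X_demo, range(1, 7)).items():
    print(k, round(info["rmse"], 3))
```

A selection rule would then be applied to the recorded centroids across the range of k; a standard validity index could stand in for the paper's criterion, but that would be a substitution rather than the method the abstract describes.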
Pages: 19
Related papers
50 in total
  • [1] Initialization methods for remote sensing image clustering using K-means algorithm
    Zhong Y.-F.
    Zhang L.-P.
    Xi Tong Gong Cheng Yu Dian Zi Ji Shu/Systems Engineering and Electronics, 2010, 32 (09): 2009 - 2014
  • [2] TABULAR K-MEANS CLUSTERING ON REMOTE SENSING IMAGES
    Tsai, Victor J. D.
    Tsui, C. K.
    IGARSS 2018 - 2018 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, 2018: 6967 - 6970
  • [3] Degrees of freedom and model selection for k-means clustering
    Hofmeyr, David P.
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2020, 149 (149)
  • [4] Parallel K-Means Clustering of Remote Sensing Images Based on MapReduce
    Lv, Zhenhua
    Hu, Yingjie
    Zhong, Haidong
    Wu, Jianping
    Li, Bo
    Zhao, Hui
    WEB INFORMATION SYSTEMS AND MINING, 2010, 6318 : 162 - +
  • [5] A SPA-BASED K-MEANS CLUSTERING ALGORITHM FOR THE REMOTE SENSING INFORMATION EXTRACTION
    Xie, Xiangjian
    Zhao, Junsan
    Li, Hongbo
    Zhang, Wanqiang
    Yuan, Lei
    2012 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS), 2012: 6111 - 6114
  • [6] SELECTION OF INITIAL PARAMETERS OF K-MEANS CLUSTERING ALGORITHM FOR MRI BRAIN IMAGE SEGMENTATION
    Liu, Jian-Wei
    Guo, Lei
    PROCEEDINGS OF 2015 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS (ICMLC), VOL. 1, 2015: 123 - 127
  • [7] Research on User Segmentation based on RFL Model and K-means Clustering Algorithm
    Chen, Yunpeng
    Liu, Ziyu
    Wang, Yan
    Qin, Yao
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON LOGISTICS, ENGINEERING, MANAGEMENT AND COMPUTER SCIENCE (LEMCS 2015), 2015, 117 : 1499 - 1503
  • [8] Unsupervised K-Means Clustering Algorithm
    Sinaga, Kristina P.
    Yang, Miin-Shen
    IEEE ACCESS, 2020, 8 : 80716 - 80727
  • [9] Soil data clustering by using K-means and fuzzy K-means algorithm
    Hot, Elma
    Popovic-Bugarin, Vesna
    2015 23RD TELECOMMUNICATIONS FORUM TELFOR (TELFOR), 2015: 890 - 893
  • [10] K-means clustering algorithm using the entropy
    Palubinskas, G
    IMAGE AND SIGNAL PROCESSING FOR REMOTE SENSING IV, 1998, 3500 : 63 - 71