Model Selection Using K-Means Clustering Algorithm for the Symmetrical Segmentation of Remote Sensing Datasets

Cited by: 16
Authors
Ali, Ishfaq [1]
Rehman, Atiq Ur [2]
Khan, Dost Muhammad [1]
Khan, Zardad [1]
Shafiq, Muhammad [3]
Choi, Jin-Ghoo [3]
Affiliations
[1] Abdul Wali Khan Univ, Dept Stat, Mardan 23200, Pakistan
[2] Int Islam Univ, Fac Basic & Appl Sci, Dept Math & Stat, Islamabad 44000, Pakistan
[3] Yeungnam Univ, Dept Informat & Commun Engn, Gyongsan 38541, South Korea
Source
SYMMETRY-BASEL | 2022 / Vol. 14 / Issue 06
Funding
National Research Foundation, Singapore;
Keywords
unsupervised clustering; k-means; balanced optimal number of clusters; symmetry; clustering validity indices; remote sensing; root mean square error; satellite images; BIG DATA; DATA SET; NUMBER;
DOI
10.3390/sym14061149
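Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification codes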
07 ; 0710 ; 09 ;
Abstract
The importance of unsupervised clustering methods is well established in the statistics and machine learning literature. Many sophisticated unsupervised classification techniques have been made available to deal with a growing number of datasets. Due to its simplicity and efficiency in clustering a large dataset, the k-means clustering algorithm is still popular and widely used in the machine learning community. However, as with other clustering methods, it requires one to choose the balanced number of clusters in advance. This paper's primary emphasis is to develop a novel method for finding the optimum number of clusters, k, using a data-driven approach. Taking into account the cluster symmetry property, the k-means algorithm is applied multiple times to a range of k values within which the balanced optimum k value is expected. This is based on the uniqueness and symmetrical nature among the centroid values for the clusters produced, and we chose the final k value as the one for which symmetry is observed. We evaluated the proposed algorithm's performance on different simulated datasets with controlled parameters and also on real datasets taken from the UCI machine learning repository. We also evaluated the performance of the proposed method with the aim of remote sensing, such as in deforestation and urbanization, using satellite images of the Islamabad region in Pakistan, taken from the Sentinel-2B satellite of the United States Geological Survey. From the experimental results and real data analysis, it is concluded that the proposed algorithm has better accuracy and minimum root mean square error than the existing methods.
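The scan step described in the abstract (running k-means over a range of k values and comparing the results) can be sketched roughly as below. This is only an illustration under stated assumptions, not the authors' method: the abstract does not specify how centroid symmetry is measured, so the sketch reports the root mean square error for each k and stops there. The helper names (`make_blobs`, `farthest_first`, `scan_k`), the synthetic three-blob data, and the farthest-first initialization are all hypothetical choices made to keep the example deterministic.

```python
import math
import random


def make_blobs(n_per_blob=40, sigma=0.5, seed=42):
    """Hypothetical test data: three well-separated 2-D Gaussian blobs."""
    rng = random.Random(seed)
    centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
    return [(rng.gauss(cx, sigma), rng.gauss(cy, sigma))
            for cx, cy in centers for _ in range(n_per_blob)]


def sq_dist(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def farthest_first(points, k):
    """Deterministic initialization (an assumption, not taken from the paper)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Pick the point whose nearest already-chosen centroid is farthest away.
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    return centroids


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm on 2-D points; returns the final centroids."""
    centroids = farthest_first(points, k)
    for _ in range(max_iter):
        # Assignment step: group each point with its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members;
        # an empty cluster keeps its previous centroid.
        updated = [
            (sum(x for x, _ in members) / len(members),
             sum(y for _, y in members) / len(members))
            if members else centroids[j]
            for j, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def rmse(points, centroids):
    """Root mean square distance from each point to its nearest centroid."""
    total = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return math.sqrt(total / len(points))


def scan_k(points, k_values):
    """Run k-means once per candidate k and record the resulting RMSE."""
    return {k: rmse(points, kmeans(points, k)) for k in k_values}


if __name__ == "__main__":
    for k, err in scan_k(make_blobs(), range(1, 7)).items():
        print(f"k={k}  RMSE={err:.3f}")
```

RMSE alone tends to keep shrinking as k grows, so it cannot pick k by itself; a selection rule such as the symmetry criterion the paper proposes would be applied on top of a scan like this one.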
Pages: 19
Related papers
50 in total
  • [31] Adaptive K-Means clustering algorithm
    Chen, Hailin
    Wu, Xiuqing
    Hu, Junhua
    MIPPR 2007: PATTERN RECOGNITION AND COMPUTER VISION, 2007, 6788
  • [32] An Enhancement of K-means Clustering Algorithm
    Gu, Jirong
    Zhou, Jieming
    Chen, Xianwei
    2009 INTERNATIONAL CONFERENCE ON BUSINESS INTELLIGENCE AND FINANCIAL ENGINEERING, PROCEEDINGS, 2009, : 237 - 240
  • [33] K-Means Clustering Efficient Algorithm with Initial Class Center Selection
    Huang Suyu
    Hu Pingfang
    PROCEEDINGS OF THE 2018 3RD INTERNATIONAL WORKSHOP ON MATERIALS ENGINEERING AND COMPUTER SCIENCES (IWMECS 2018), 2018, 78 : 301 - 305
  • [34] Initial Centroid Selection Method for an Enhanced K-means Clustering Algorithm
    Aamer, Youssef
    Benkaouz, Yahya
    Ouzzif, Mohammed
    Bouragba, Khalid
    UBIQUITOUS NETWORKING, UNET 2019, 2020, 12293 : 182 - 190
  • [35] A fast and effective partitional clustering algorithm for large categorical datasets using a k-means based approach
    Ben Salem, Semeh
    Naouali, Sami
    Chtourou, Zied
    COMPUTERS & ELECTRICAL ENGINEERING, 2018, 68 : 463 - 483
  • [36] Evaluating the attributes of remote sensing image pixels for fast k-means clustering
    Saglam, Ali
    Baykan, Nurdan Akhan
    TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, 2019, 27 (06) : 4188 - 4202
  • [37] K*-Means: An Effective and Efficient K-means Clustering Algorithm
    Qi, Jianpeng
    Yu, Yanwei
    Wang, Lihong
    Liu, Jinglei
    PROCEEDINGS OF 2016 IEEE INTERNATIONAL CONFERENCES ON BIG DATA AND CLOUD COMPUTING (BDCLOUD 2016) SOCIAL COMPUTING AND NETWORKING (SOCIALCOM 2016) SUSTAINABLE COMPUTING AND COMMUNICATIONS (SUSTAINCOM 2016) (BDCLOUD-SOCIALCOM-SUSTAINCOM 2016), 2016, : 242 - 249
  • [38] Road Region Segmentation of Remote Sensing Images Based on K-means and PCNN
    Yang, Xiaocui
    Meng Wanli
    PROCEEDINGS OF 2018 THE 3RD INTERNATIONAL CONFERENCE ON MULTIMEDIA AND IMAGE PROCESSING (ICMIP 2018), 2018, : 36 - 40
  • [39] A MapReduce-based K-means clustering algorithm
    Mao, YiMin
    Gan, DeJin
    Mwakapesa, D. S.
    Nanehkaran, Y. A.
    Tao, Tao
    Huang, XueYu
    JOURNAL OF SUPERCOMPUTING, 2022, 78 (04) : 5181 - 5202
  • [40] An active contour model driven by K-means clustering for image segmentation
    Ge, Pengqiang
    Chen, Yiyang
    Wang, Guina
    Weng, Guirong
    Chen, Hongtian
    2023 35TH CHINESE CONTROL AND DECISION CONFERENCE, CCDC, 2023, : 4595 - 4600