Data clustering: application and trends

Cited by: 72
Authors
Oyewole, Gbeminiyi John [1]
Thopil, George Alex [1]
Affiliations
[1] Univ Pretoria, Dept Engn & Technol Management, Pretoria, South Africa
Keywords
Clustering; Clustering classification; Clustering components; Industry applications; Clustering algorithms; Clustering trends; PATTERN-CLASSIFICATION; R PACKAGE; ALGORITHMS; SYSTEM; ICT; INFORMATION; EXPLORATION; INDICATORS; CHALLENGES; MANAGEMENT;
DOI
10.1007/s10462-022-10325-y
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Clustering has primarily been used as an analytical technique to group unlabeled data for extracting meaningful information. Because no single clustering algorithm can solve all clustering problems, many clustering algorithms with diverse applications have been developed. We review data clustering, intending to underscore recent applications in selected industrial sectors and other notable concepts. In this paper, we begin by highlighting clustering components and discussing classification terminologies. Furthermore, specific and general applications of clustering are discussed. Notable concepts on clustering algorithms, emerging variants, measures of similarity and dissimilarity, issues surrounding clustering optimization, validation, and data types are outlined. Suggestions are made to emphasize the continued interest in clustering techniques by both scholars and industry practitioners. Key findings in this review identify the size of data as a classification criterion. As data sizes for clustering become larger and more varied, determining the optimal number of clusters will require new feature extraction methods, validation indices, and clustering techniques. In addition, clustering techniques have found growing use in key industry sectors linked to the sustainable development goals, such as manufacturing, transportation and logistics, energy, and healthcare, where clustering is more often integrated with other analytical techniques than used as a stand-alone technique.
Pages: 6439 - 6475
Page count: 37
Related papers
50 in total
  • [31] DBSCAN and CLARA Clustering Algorithms and their usage for the Soil Data Clustering
    Vukcevic, M.
    Popovic-Bugarin, V.
    Dervic, E.
    2019 8TH MEDITERRANEAN CONFERENCE ON EMBEDDED COMPUTING (MECO), 2019, : 456 - 461
  • [32] An Effective Crow Search Algorithm and Its Application in Data Clustering
    Ranjan, Rajesh
    Chhabra, Jitender Kumar
    JOURNAL OF CLASSIFICATION, 2025, 42 (01) : 134 - 162
  • [33] A New Line Symmetry Distance and Its Application to Data Clustering
    Sriparna Saha
    Sanghamitra Bandyopadhyay
    Journal of Computer Science and Technology, 2009, 24 : 544 - 556
  • [34] Knowledge discovery with clustering based on rules by states: A water treatment application
    Gibert, K.
    Rodriguez-Silva, G.
    Rodriguez-Roda, I.
    ENVIRONMENTAL MODELLING & SOFTWARE, 2010, 25 (06) : 712 - 723
  • [35] Application and visualization of typical clustering algorithms in seismic data analysis
    Fan, Z.
    Xu, X.
    10TH INTERNATIONAL CONFERENCE ON AMBIENT SYSTEMS, NETWORKS AND TECHNOLOGIES (ANT 2019) / THE 2ND INTERNATIONAL CONFERENCE ON EMERGING DATA AND INDUSTRY 4.0 (EDI40 2019) / AFFILIATED WORKSHOPS, 2019, 151 : 171 - 178
  • [36] A New Line Symmetry Distance and Its Application to Data Clustering
    Sriparna Saha
    Sanghamitra Bandyopadhyay
    Journal of Computer Science & Technology, 2009, 24 (03) : 544 - 556
  • [37] A comprehensive review of energy blockchain: Application scenarios and development trends
    Teng, Fei
    Zhang, Qi
    Wang, Ge
    Liu, Jiangfeng
    Li, Hailong
    INTERNATIONAL JOURNAL OF ENERGY RESEARCH, 2021, 45 (12) : 17515 - 17531
  • [38] Learning-based EM clustering for data on the unit hypersphere with application to exoplanet data
    Yang, Miin-Shen
    Chang-Chien, Shou-Jen
    Hung, Wen-Liang
    APPLIED SOFT COMPUTING, 2017, 60 : 101 - 114
  • [39] What Can We Learn from the Functional Clustering of Mortality Data? An Application to the Human Mortality Database
    Leger, Ainhoa-Elena
    Mazzuco, Stefano
    EUROPEAN JOURNAL OF POPULATION-REVUE EUROPEENNE DE DEMOGRAPHIE, 2021, 37 (4-5) : 769 - 798
  • [40] Bio-inspired Clustering: basic features and future trends in the era of Big Data
    Camacho, David
    2015 IEEE 2ND INTERNATIONAL CONFERENCE ON CYBERNETICS (CYBCONF), 2015, : 1 - 6