Data clustering: application and trends

Cited by: 72
Authors
Oyewole, Gbeminiyi John [1]
Thopil, George Alex [1]
Affiliations
[1] Univ Pretoria, Dept Engn & Technol Management, Pretoria, South Africa
Keywords
Clustering; Clustering classification; Clustering components; Industry applications; Clustering algorithms; Clustering trends; PATTERN-CLASSIFICATION; R PACKAGE; ALGORITHMS; SYSTEM; ICT; INFORMATION; EXPLORATION; INDICATORS; CHALLENGES; MANAGEMENT
DOI
10.1007/s10462-022-10325-y
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Clustering has primarily been used as an analytical technique to group unlabeled data for extracting meaningful information. The fact that no clustering algorithm can solve all clustering problems has resulted in the development of several clustering algorithms with diverse applications. We review data clustering, intending to underscore recent applications in selected industrial sectors and other notable concepts. In this paper, we begin by highlighting clustering components and discussing classification terminologies. Furthermore, specific and general applications of clustering are discussed. Notable concepts on clustering algorithms, emerging variants, measures of similarity/dissimilarity, issues surrounding clustering optimization, validation, and data types are outlined. Suggestions are made to emphasize the continued interest in clustering techniques by both scholars and industry practitioners. Key findings of this review show that data size serves as a classification criterion, and that as data sets for clustering become larger and more varied, determining the optimal number of clusters will require new feature extraction methods, validation indices, and clustering techniques. In addition, clustering techniques have found growing use in key industry sectors linked to the Sustainable Development Goals, such as manufacturing, transportation and logistics, energy, and healthcare, where clustering is more often integrated with other analytical techniques than used as a stand-alone technique.
Pages: 6439-6475
Number of pages: 37