Enhancing grid-density based clustering for high dimensional data

Cited by: 21
Authors
Zhao, Yanchang
Cao, Jie [1]
Zhang, Chengqi [2]
Zhang, Shichao [3]
Affiliations
[1] Nanjing Univ Finance & Econ, Jiangsu Prov Key Lab E Business, Nanjing 210003, Peoples R China
[2] Univ Technol Sydney, Ctr Quantum Computat & Intelligent Syst, Fac Engn & Informat Technol, Sydney, NSW 2007, Australia
[3] Guangxi Normal Univ, Coll CS & IT, Guilin, Peoples R China
Keywords
Clustering; Subspace clustering; High dimensional data; ALGORITHM
DOI
10.1016/j.jss.2011.02.047
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
We propose an enhanced grid-density based approach for clustering high dimensional data. Our technique treats objects (or points) as atomic units, so the size requirement on cells is waived without losing clustering accuracy. For efficiency, a new partitioning is developed to make the number of cells smoothly adjustable; the concept of ith-order neighbors is defined to avoid considering an exponential number of neighboring cells; and a novel density compensation is proposed to improve clustering accuracy and quality. We experimentally evaluate our approach and demonstrate that our algorithm significantly improves clustering accuracy and quality. © 2011 Elsevier Inc. All rights reserved.
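For orientation, the generic grid-density baseline that the abstract sets out to enhance can be sketched roughly as follows. This is a minimal illustration only, not the authors' algorithm: it does not implement their partitioning, ith-order neighbors, or density compensation. The function name `grid_density_cluster` and the parameters `bins` and `min_pts` are hypothetical, and restricting the search to face-adjacent cells is an assumption made here to keep the neighbor count linear in the number of dimensions.

```python
from collections import Counter, deque


def grid_density_cluster(points, bins, min_pts):
    """Baseline grid-density clustering (illustrative; not the paper's method).

    points  : list of equal-length tuples of floats
    bins    : intervals per dimension (hypothetical parameter)
    min_pts : a cell is "dense" if it holds at least this many points
    Returns one label per point; -1 marks points in non-dense cells.
    """
    dims = len(points[0])
    lows = [min(p[d] for p in points) for d in range(dims)]
    highs = [max(p[d] for p in points) for d in range(dims)]

    def cell_of(p):
        idx = []
        for d in range(dims):
            span = highs[d] - lows[d]
            # Zero-width dimensions map to bin 0; the maximum value is
            # clamped into the last bin.
            i = 0 if span == 0 else int((p[d] - lows[d]) / span * bins)
            idx.append(min(i, bins - 1))
        return tuple(idx)

    cells = [cell_of(p) for p in points]
    counts = Counter(cells)
    dense = {c for c, n in counts.items() if n >= min_pts}

    # Flood-fill over dense cells. Assumption: only face-adjacent cells
    # (2 * dims per cell) are visited, not all 3**dims - 1 surrounding cells.
    cell_label = {}
    next_label = 0
    for start in sorted(dense):
        if start in cell_label:
            continue
        cell_label[start] = next_label
        queue = deque([start])
        while queue:
            cur = queue.popleft()
            for d in range(dims):
                for step in (-1, 1):
                    nb = cur[:d] + (cur[d] + step,) + cur[d + 1:]
                    if nb in dense and nb not in cell_label:
                        cell_label[nb] = next_label
                        queue.append(nb)
        next_label += 1

    return [cell_label.get(c, -1) for c in cells]


# Two tight groups plus one isolated point (made-up toy data).
data = [(0.0, 0.0), (0.1, 0.1), (0.2, 0.1), (0.1, 0.2),
        (5.0, 5.0), (5.1, 5.1), (4.9, 5.0), (5.0, 4.9),
        (2.5, 9.0)]
labels = grid_density_cluster(data, bins=4, min_pts=2)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

In this baseline, the result depends heavily on the choice of `bins` and `min_pts`, and a full neighborhood search would grow exponentially with the number of dimensions; these are the kinds of limitations the abstract says the proposed approach addresses.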
Pages: 1524 - 1539
Page count: 16
Related papers
50 in total
  • [11] A density-grid-based method for clustering k-dimensional data
    Elham S. Kashani
    Saeed Bagheri Shouraki
    Yaser Norouzi
    Bernard De Baets
    Applied Intelligence, 2023, 53 : 10559 - 10573
  • [12] Clustering High-Dimensional Data: A Survey on Subspace Clustering, Pattern-Based Clustering, and Correlation Clustering
    Kriegel, Hans-Peter
    Kroeger, Peer
    Zimek, Arthur
    ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2009, 3 (01)
  • [13] A Systematic Review of Density Grid-Based Clustering for Data Streams
    Tareq, Mustafa
    Sundararajan, Elankovan A.
    Harwood, Aaron
    Abu Bakar, Azuraliza
    IEEE ACCESS, 2022, 10 : 579 - 596
  • [14] Stream Data Clustering Based on Grid Density and Attraction
    Tu, Li
    Chen, Yixin
    ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2009, 3 (03)
  • [15] Incomplete high dimensional data streams clustering
    Najib, Fatma M.
    Ismail, Rasha M.
    Badr, Nagwa L.
    Gharib, Tarek F.
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2020, 39 (03) : 4227 - 4243
  • [16] GACH: a grid-based algorithm for hierarchical clustering of high-dimensional data
    Mansoori, Eghbal G.
    SOFT COMPUTING, 2014, 18 (05) : 905 - 922
  • [18] Clustering for High Dimensional Data
    Sharma, Varun Kumar
    Bala, Anju
    2014 FIRST INTERNATIONAL CONFERENCE ON NETWORKS & SOFT COMPUTING (ICNSC), 2014 : 365 - 369
  • [19] A Clustering Algorithm Based on Density-Grid for Stream Data
    Zhang, Dandan
    Tian, Hui
    Sang, Yingpeng
    Li, Yidong
    Wu, Yanbo
    Wu, Jun
    Shen, Hong
    2012 13TH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED COMPUTING, APPLICATIONS, AND TECHNOLOGIES (PDCAT 2012), 2012 : 398 - 403
  • [20] A Trajectory Data Clustering Method Based On Dynamic Grid Density
    Li, Junhuai
    Yang, Mengmeng
    Liu, Na
    Wang, Zhixiao
    Yu, Lei
    INTERNATIONAL JOURNAL OF GRID AND DISTRIBUTED COMPUTING, 2015, 8 (02) : 1 - 8