Enhancing grid-density based clustering for high dimensional data

Cited by: 21
Authors
Zhao, Yanchang
Cao, Jie [1]
Zhang, Chengqi [2]
Zhang, Shichao [3]
Affiliations
[1] Nanjing Univ Finance & Econ, Jiangsu Prov Key Lab E Business, Nanjing 210003, Peoples R China
[2] Univ Technol Sydney, Ctr Quantum Computat & Intelligent Syst, Fac Engn & Informat Technol, Sydney, NSW 2007, Australia
[3] Guangxi Normal Univ, Coll CS & IT, Guilin, Peoples R China
Keywords
Clustering; Subspace clustering; High dimensional data; ALGORITHM;
DOI
10.1016/j.jss.2011.02.047
Chinese Library Classification (CLC) number
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
We propose an enhanced grid-density based approach for clustering high dimensional data. Our technique takes objects (or points) as atomic units in which the size requirement to cells is waived without losing clustering accuracy. For efficiency, a new partitioning is developed to make the number of cells smoothly adjustable; a concept of the ith-order neighbors is defined for avoiding considering the exponential number of neighboring cells; and a novel density compensation is proposed for improving the clustering accuracy and quality. We experimentally evaluate our approach and demonstrate that our algorithm significantly improves the clustering accuracy and quality. (C) 2011 Elsevier Inc. All rights reserved.
Pages: 1524-1539
Number of pages: 16