CABOSFV algorithm for high dimensional sparse data clustering

Cited by: 0
Authors
Wu, S [1]
Gao, XD [1]
Affiliations
[1] Univ Sci & Technol Beijing, Sch Management, Beijing 100083, Peoples R China
Source
JOURNAL OF UNIVERSITY OF SCIENCE AND TECHNOLOGY BEIJING | 2004 / Vol. 11 / No. 03
Keywords
clustering; data mining; sparse; high dimensionality
DOI
Not available
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
An algorithm, the Clustering Algorithm Based On Sparse Feature Vector (CABOSFV), was proposed for clustering high-dimensional binary sparse data. The algorithm compresses the data effectively using a tool called the 'Sparse Feature Vector', which greatly reduces the data scale, and it obtains the clustering result with only one scan of the data. Both theoretical analysis and empirical tests showed that CABOSFV has low computational complexity. The algorithm finds clusters in large high-dimensional datasets efficiently and handles noise effectively.
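The abstract does not define the Sparse Feature Vector or the merge rule, so the following is only a minimal sketch of the general idea: each cluster is kept as a compact summary rather than as raw rows, and every object is assigned during a single scan. The summary fields (`count`, `shared`, `differing`), the `difference` measure, and the `threshold` parameter are all assumptions made for illustration and may differ from the paper's definitions.

```python
from dataclasses import dataclass


@dataclass
class SparseSummary:
    """Hypothetical per-cluster summary; the field names are assumptions."""
    count: int            # number of objects absorbed so far
    shared: frozenset     # attributes equal to 1 in every object
    differing: frozenset  # attributes equal to 1 in some objects but not all

    def merged_with(self, items):
        """Return the summary after adding one object (its set of 1-attributes)."""
        items = frozenset(items)
        shared = self.shared & items
        differing = (self.shared | self.differing | items) - shared
        return SparseSummary(self.count + 1, shared, differing)

    def difference(self):
        """Assumed dissimilarity: differing attributes per object per shared attribute."""
        if not self.shared:
            return float("inf")
        return len(self.differing) / (self.count * len(self.shared))


def one_pass_cluster(objects, threshold):
    """Assign each object to a cluster in a single scan; returns a list of labels."""
    summaries, labels = [], []
    for items in objects:
        best_idx, best = None, None
        # Try the object against every existing summary, never the raw data.
        for idx, summary in enumerate(summaries):
            candidate = summary.merged_with(items)
            if best is None or candidate.difference() < best.difference():
                best_idx, best = idx, candidate
        if best is not None and best.difference() <= threshold:
            summaries[best_idx] = best
            labels.append(best_idx)
        else:
            # No summary is close enough: start a new cluster.
            summaries.append(SparseSummary(1, frozenset(items), frozenset()))
            labels.append(len(summaries) - 1)
    return labels


# Each object is the set of attribute indices whose value is 1.
rows = [{1, 2, 3}, {1, 2, 3, 4}, {7, 8}, {7, 8, 9}, {1, 2}]
labels = one_pass_cluster(rows, threshold=0.5)
```

Because only the summaries are stored and updated, the raw rows can be discarded after they are read, which is what makes a single scan sufficient in this sketch.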
Pages: 283 - 288
Page count: 6
Related papers
50 items in total
  • [11] Interpolation-based k-means Clustering Improvement for Sparse, High Dimensional Data
    Chen, Wanghu
    Tian, Zhen
    PROCEEDINGS OF 2019 3RD INTERNATIONAL CONFERENCE ON CLOUD AND BIG DATA COMPUTING (ICCBDC 2019), 2019, : 11 - 15
  • [12] Clustering algorithm of high-dimensional data based on units
    School of Information Engineering, Hubei Institute for Nationalities, Enshi 445000, China
    Jisuanji Yanjiu yu Fazhan, 2007, 9 (1618-1623): : 1618 - 1623
  • [13] Visual interactive evolutionary algorithm for high dimensional outlier detection and data clustering problems
    Boudjeloud-Assala, Lydia
    INTERNATIONAL JOURNAL OF BIO-INSPIRED COMPUTATION, 2012, 4 (01) : 6 - 13
  • [14] Divisive clustering of high dimensional data streams
    Hofmeyr, David P.
    Pavlidis, Nicos G.
    Eckley, Idris A.
    STATISTICS AND COMPUTING, 2016, 26 (05) : 1101 - 1120
  • [15] Divisive clustering of high dimensional data streams
    David P. Hofmeyr
    Nicos G. Pavlidis
    Idris A. Eckley
    Statistics and Computing, 2016, 26 : 1101 - 1120
  • [16] High Dimensional Data Stream Clustering Algorithm Based on Random Projection
    Zhu Y.
    Chen S.
    Chen, Songcan (s.chen@nuaa.edu.cn), 1683, Science Press (57): : 1683 - 1696
  • [17] A Clustering Algorithm Based on Matrix over High Dimensional Data Stream
    Hou, Guibin
    Yao, Ruixia
    Ren, Jiadong
    Hu, Changzhen
    WEB INFORMATION SYSTEMS AND MINING, 2010, 6318 : 86 - +
  • [18] Persistent homology based clustering algorithm for high-dimensional data
    Xiong Z.
    Wei Y.
    Xiong Z.
    He K.
    Huazhong Keji Daxue Xuebao (Ziran Kexue Ban)/Journal of Huazhong University of Science and Technology (Natural Science Edition), 2024, 52 (02): : 29 - 35
  • [19] An intelligent clustering algorithm for high-dimensional multiview data in big data applications
    Tao, Qian
    Gu, Chunqin
    Wang, Zhenyu
    Jiang, Daoning
    NEUROCOMPUTING, 2020, 393 : 234 - 244
  • [20] GCHL: A grid-clustering algorithm for high-dimensional very large spatial data bases
    Pilevar, AH
    Sukumar, M
    PATTERN RECOGNITION LETTERS, 2005, 26 (07) : 999 - 1010