CABOSFV algorithm for high dimensional sparse data clustering

Cited by: 6
Authors
Wu, S [1]
Gao, XD [1]
Affiliation
[1] Univ Sci & Technol Beijing, Sch Management, Beijing 100083, Peoples R China
Source
JOURNAL OF UNIVERSITY OF SCIENCE AND TECHNOLOGY BEIJING | 2004, Vol. 11, No. 3
Keywords
clustering; data mining; sparse; high dimensionality
DOI
Not available
Chinese Library Classification number
T [Industrial Technology]
Subject classification code
08
Abstract
An algorithm, the Clustering Algorithm Based On Sparse Feature Vector (CABOSFV), was proposed for clustering high-dimensional sparse binary data. The algorithm compresses the data effectively by means of a 'Sparse Feature Vector', which greatly reduces the data scale, and it obtains the clustering result in a single scan of the data. Both theoretical analysis and empirical tests showed that CABOSFV has low computational complexity. The algorithm finds clusters in large high-dimensional datasets efficiently and handles noise effectively.
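The abstract does not define the Sparse Feature Vector or the merge rule, so the following is only a minimal sketch of the general idea (one compact summary per cluster, updated in a single pass over sparse binary objects), not the paper's exact procedure. The names `ClusterSummary` and `one_scan_cluster`, the dissimilarity formula, and the `threshold` parameter are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ClusterSummary:
    """Hypothetical compact summary of one cluster (stand-in for a sparse feature vector)."""
    size: int        # number of objects absorbed so far
    common: set      # attributes equal to 1 in every member
    differing: set   # attributes equal to 1 in some, but not all, members

    def merged_with(self, obj):
        """Return the summary after adding one object; raw members are never revisited."""
        union = self.common | self.differing | obj
        common = self.common & obj
        return ClusterSummary(self.size + 1, common, union - common)

    def dissimilarity(self):
        """Assumed measure: differing attributes relative to size times shared attributes."""
        if not self.common:
            return float("inf")
        return len(self.differing) / (self.size * len(self.common))


def one_scan_cluster(objects, threshold):
    """Assign each object (a set of attribute indices equal to 1) in a single pass."""
    summaries = []
    labels = []
    for obj in objects:
        obj = frozenset(obj)
        best_idx, best_summary, best_d = None, None, None
        for i, summary in enumerate(summaries):
            candidate = summary.merged_with(obj)
            d = candidate.dissimilarity()
            # Keep the cluster whose summary stays most compact after the merge.
            if d <= threshold and (best_d is None or d < best_d):
                best_idx, best_summary, best_d = i, candidate, d
        if best_idx is None:
            # No existing cluster is close enough: start a new one.
            summaries.append(ClusterSummary(1, set(obj), set()))
            labels.append(len(summaries) - 1)
        else:
            summaries[best_idx] = best_summary
            labels.append(best_idx)
    return labels, summaries


if __name__ == "__main__":
    data = [{0, 1, 2}, {0, 1, 2, 3}, {7, 8}, {7, 8, 9}, {0, 1}]
    labels, summaries = one_scan_cluster(data, threshold=0.5)
    print(labels)
    print(len(summaries))
```

Because each cluster is represented only by its summary, the work per object in this sketch depends on the number of clusters and the number of nonzero attributes rather than on the full dimensionality, which is the kind of saving the abstract attributes to compression.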
Pages: 283-288
Page count: 6