CABOSFV algorithm for high dimensional sparse data clustering

Cited by: 6
Authors
Wu, S [1]
Gao, XD [1]
Affiliation
[1] Univ Sci & Technol Beijing, Sch Management, Beijing 100083, Peoples R China
Source
JOURNAL OF UNIVERSITY OF SCIENCE AND TECHNOLOGY BEIJING | 2004, Vol. 11, No. 3
Keywords
clustering; data mining; sparse; high dimensionality
DOI
Not available
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
An algorithm, the Clustering Algorithm Based On Sparse Feature Vector (CABOSFV), was proposed for clustering high-dimensional binary sparse data. The algorithm compresses the data effectively by means of a 'Sparse Feature Vector', which greatly reduces the data scale, and it obtains the clustering result with only one scan of the data. Both theoretical analysis and empirical tests showed that CABOSFV has low computational complexity. The algorithm finds clusters in large high-dimensional datasets efficiently and handles noise effectively.
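The abstract does not define the Sparse Feature Vector or the merge rule, so the following is only a minimal sketch of the general idea (compress each cluster into a small summary, then assign every object in a single pass), not the authors' method. The summary layout `ClusterSummary`, the dissimilarity formula, the `threshold` parameter, and the function name `one_scan_cluster` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterSummary:
    """Compressed summary of a cluster of binary objects (assumed layout)."""
    n: int                 # number of objects absorbed so far
    common: frozenset      # attributes equal to 1 in every member
    union: frozenset       # attributes equal to 1 in at least one member
    members: list = field(default_factory=list)

    def merged_with(self, obj_id, attrs):
        # Merging only needs the summary, never the original rows.
        return ClusterSummary(
            self.n + 1,
            self.common & attrs,
            self.union | attrs,
            self.members + [obj_id],
        )

    def dissimilarity(self):
        # Assumed measure: attributes on which members disagree, relative to
        # cluster size times the number of attributes shared by all members.
        if not self.common:
            return float("inf")
        return len(self.union - self.common) / (self.n * len(self.common))


def one_scan_cluster(objects, threshold):
    """Assign each (id, set of 1-valued attribute indices) in one pass."""
    clusters = []
    for obj_id, attrs in objects:
        attrs = frozenset(attrs)
        best_idx, best = None, None
        for i, cluster in enumerate(clusters):
            candidate = cluster.merged_with(obj_id, attrs)
            score = candidate.dissimilarity()
            if score <= threshold and (best is None or score < best.dissimilarity()):
                best_idx, best = i, candidate
        if best is None:
            # No existing cluster accepts the object: start a new one.
            clusters.append(ClusterSummary(1, attrs, attrs, [obj_id]))
        else:
            clusters[best_idx] = best
    return clusters


if __name__ == "__main__":
    data = [
        ("a", {1, 2, 3}),
        ("b", {1, 2, 3, 4}),
        ("c", {7, 8}),
        ("d", {7, 8, 9}),
        ("e", {20}),
    ]
    for cluster in one_scan_cluster(data, threshold=0.5):
        print(cluster.members, cluster.dissimilarity())
```

Each object is stored only as the set of attribute indices whose value is 1, which is what keeps the representation small for sparse data. Under this assumed rule, an object that matches no existing cluster (such as `"e"` above) ends up alone, which is one plausible way isolated noise points could be separated out.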
Pages: 283-288
Number of pages: 6