CABOSFV algorithm for high dimensional sparse data clustering

Cited by: 6
Authors
Wu, S [1]
Gao, XD [1]
Affiliation
[1] Univ Sci & Technol Beijing, Sch Management, Beijing 100083, Peoples R China
Source
JOURNAL OF UNIVERSITY OF SCIENCE AND TECHNOLOGY BEIJING | 2004, Vol. 11, No. 3
Keywords
clustering; data mining; sparse; high dimensionality
DOI
Not available
Chinese Library Classification
T [Industrial Technology]
Discipline classification code
08
Abstract
An algorithm, Clustering Algorithm Based On Sparse Feature Vector (CABOSFV), was proposed for clustering high dimensional binary sparse data. The algorithm compresses the data effectively using a tool called the 'Sparse Feature Vector', which greatly reduces the data scale, and it obtains the clustering result with only one data scan. Both theoretical analysis and empirical tests showed that CABOSFV has low computational complexity. The algorithm finds clusters in large high dimensional datasets efficiently and handles noise effectively.
Pages: 283-288
Page count: 6
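
The abstract describes compressing each cluster into a compact summary and assigning every object during a single scan. A minimal Python sketch of that general idea follows. It is not the paper's specification: the summary fields (`count`, `all_ones`, `any_ones`), the `difference` measure, and the `threshold` parameter are all assumptions chosen for illustration, since the abstract does not define the Sparse Feature Vector or its similarity criterion.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterSummary:
    """Hypothetical compressed summary of a cluster of binary sparse objects."""
    count: int              # number of objects in the cluster
    all_ones: frozenset     # attributes equal to 1 for every member
    any_ones: frozenset     # attributes equal to 1 for at least one member
    members: list = field(default_factory=list)

    def merged_with(self, obj):
        """Summary fields that would result from adding obj (no mutation)."""
        return self.count + 1, self.all_ones & obj, self.any_ones | obj


def difference(count, all_ones, any_ones):
    """Assumed dissimilarity: attributes the members disagree on,
    scaled by cluster size and by the number of shared attributes."""
    if not all_ones:
        return float("inf")  # no attribute shared by all members
    return len(any_ones - all_ones) / (count * len(all_ones))


def one_pass_cluster(objects, threshold):
    """Assign each object once, using only the summaries, never raw members."""
    clusters = []
    for idx, raw in enumerate(objects):
        obj = frozenset(raw)  # object = set of attribute indices equal to 1
        best, best_diff = None, threshold
        for cluster in clusters:
            d = difference(*cluster.merged_with(obj))
            if d <= best_diff:
                best, best_diff = cluster, d
        if best is None:
            clusters.append(ClusterSummary(1, obj, obj, [idx]))
        else:
            best.count, best.all_ones, best.any_ones = best.merged_with(obj)
            best.members.append(idx)
    return clusters


data = [{1, 2, 3}, {1, 2, 3, 4}, {7, 8}, {7, 8, 9}, {1, 2}]
result = one_pass_cluster(data, threshold=0.5)
print([c.members for c in result])  # [[0, 1, 4], [2, 3]]
```

Each object is stored as the set of attribute indices that are 1, which suits sparse binary data, and the scan only ever touches the per-cluster summaries, so memory grows with the number of clusters rather than the number of objects.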