High Dimensional Sparse Data Clustering Algorithm Based on Concept Feature Vector (CABOCFV)

Cited by: 0

Authors:

Wu, Sen [1]

Gu, Shujuan [1]

Gao, Xuedong [1]

Affiliation:

[1] Univ Sci & Technol Beijing, Sch Econ & Management, Beijing 100083, Peoples R China

Source:

IEEE/SOLI'2008: PROCEEDINGS OF 2008 IEEE INTERNATIONAL CONFERENCE ON SERVICE OPERATIONS AND LOGISTICS, AND INFORMATICS, VOLS 1 AND 2 | 2008

Keywords:

Clustering Analysis; High Dimensional Data; Concept Lattice Construction;

DOI:

10.1109/SOLI.2008.4686391

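Chinese Library Classification (CLC) number:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification code: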

0808 ; 0809 ;

Abstract:

Finding clusters of data objects in high dimensional space is challenging, especially considering that such data can be sparse and highly skewed. This paper focuses on using concept lattices to solve the high dimensional sparse data clustering problem. Concept Lattice Theory is an effective tool for data analysis and knowledge processing: it integrates the concept intent (attributes) and concept extent (objects) and describes the hierarchical relationship among concept nodes. The construction of a concept lattice is itself a process of concept clustering, but because the lattice is complete it produces a huge number of concept nodes, and concept nodes whose extent is too large or too small are of little interest. This paper proposes an effective high dimensional sparse data Clustering Algorithm Based On Concept Feature Vector (CABOCFV), which reduces the redundancy of concept construction using the 'Concept Sparse Feature Distance' and the 'Concept Feature Vector', and introduces an effective noise recognition strategy. The CABOCFV clustering algorithm is insensitive to the input order of data objects and scans the database only once. Experiments show that CABOCFV is effective and efficient for high dimensional sparse data clustering.

Pages: 202-206

Page count: 5
