High Dimensional Sparse Data Clustering Algorithm Based on Concept Feature Vector (CABOCFV)

Times cited: 0
Authors
Wu, Sen [1]
Gu, Shujuan [1]
Gao, Xuedong [1]
Affiliations
[1] Univ Sci & Technol Beijing, Sch Econ & Management, Beijing 100083, Peoples R China
Source
IEEE/SOLI'2008: PROCEEDINGS OF 2008 IEEE INTERNATIONAL CONFERENCE ON SERVICE OPERATIONS AND LOGISTICS, AND INFORMATICS, VOLS 1 AND 2 | 2008
Keywords
Clustering Analysis; High Dimensional Data; Concept Lattice Construction
DOI
10.1109/SOLI.2008.4686391
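Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronic and Communication Technology]
Discipline classification code
0808; 0809
Abstract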
Finding clusters of data objects in high dimensional space is challenging, especially considering that such data can be sparse and highly skewed. This paper focuses on using Concept Lattice to solve the high dimensional sparse data clustering problem. Concept Lattice Theory is an effective tool for data analysis and knowledge processing; it integrates the concept intent (attributes) and the concept extent (objects) and describes the hierarchical relationship among concept nodes. The construction of a concept lattice is itself a process of concept clustering, but because the lattice is complete it produces a huge number of concept nodes, and we are not interested in the concept nodes whose extent is too large or too small. This paper proposes an effective high dimensional sparse data Clustering Algorithm Based On Concept Feature Vector (CABOCFV), which reduces the redundancy of concept construction using the 'Concept Sparse Feature Distance' and the 'Concept Feature Vector', and introduces an effective noise recognition strategy. The CABOCFV clustering algorithm is insensitive to the input order of data objects and scans the database only once. Experiments show that CABOCFV is effective and efficient for high dimensional sparse data clustering.
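The abstract's point about concept lattices can be illustrated with a minimal sketch of standard formal concept analysis. This is not CABOCFV: the record does not define 'Concept Sparse Feature Distance' or 'Concept Feature Vector', so neither is implemented here. The toy context, the function names, and the size thresholds are all hypothetical. The sketch enumerates every (extent, intent) pair of a small binary object-attribute table by brute force, then keeps only the concepts whose extent is neither too large nor too small.

```python
from itertools import combinations

# Hypothetical toy context: each object maps to the set of attributes it has.
CONTEXT = {
    "o1": {"a", "b"},
    "o2": {"a", "b", "c"},
    "o3": {"c", "d"},
    "o4": {"d"},
}
ALL_ATTRIBUTES = frozenset().union(*CONTEXT.values())


def common_attributes(objects):
    """Intent: the attributes shared by every object in `objects`."""
    intent = set(ALL_ATTRIBUTES)
    for obj in objects:
        intent &= CONTEXT[obj]
    return frozenset(intent)


def objects_having(attributes):
    """Extent: the objects that have every attribute in `attributes`."""
    return frozenset(o for o, attrs in CONTEXT.items() if attributes <= attrs)


def all_concepts():
    """Enumerate all formal concepts by closing every subset of objects.

    Brute force (exponential in the number of objects); only meant to show
    why a complete lattice yields many concept nodes.
    """
    concepts = set()
    names = sorted(CONTEXT)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            intent = common_attributes(subset)
            extent = objects_having(intent)
            concepts.add((extent, intent))
    return concepts


def filter_by_extent_size(concepts, min_size, max_size):
    """Keep concepts whose extent size lies in [min_size, max_size]."""
    return [c for c in concepts if min_size <= len(c[0]) <= max_size]


if __name__ == "__main__":
    concepts = all_concepts()
    kept = filter_by_extent_size(concepts, min_size=2, max_size=3)
    for extent, intent in sorted(kept, key=lambda c: sorted(c[0])):
        print(sorted(extent), sorted(intent))
```

Even this four-object context yields seven concepts, of which three survive the extent-size filter. The exhaustive enumeration is the kind of redundancy the abstract says CABOCFV is designed to avoid.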
Pages: 202-206
Number of pages: 5