High Dimensional Sparse data Clustering Algorithm Based on Concept Feature Vector (CABOCFV)

Times cited: 0
Authors
Wu, Sen [1]
Gu, Shujuan [1]
Gao, Xuedong [1]
Affiliation
[1] Univ Sci & Technol Beijing, Sch Econ & Management, Beijing 100083, Peoples R China
Source
IEEE/SOLI'2008: PROCEEDINGS OF 2008 IEEE INTERNATIONAL CONFERENCE ON SERVICE OPERATIONS AND LOGISTICS, AND INFORMATICS, VOLS 1 AND 2 | 2008
Keywords
Clustering Analysis; High Dimensional Data; Concept Lattice Construction
DOI
10.1109/SOLI.2008.4686391
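Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract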
Finding clusters of data objects in high dimensional space is challenging, especially because such data can be sparse and highly skewed. This paper focuses on using concept lattices to solve the high dimensional sparse data clustering problem. Concept Lattice Theory is an effective tool for data analysis and knowledge processing: it integrates the concept intent (attributes) and the concept extent (objects), and describes the hierarchical relationships among concept nodes. Constructing a concept lattice is itself a process of concept clustering, but because the lattice is complete, it produces a huge number of concept nodes, and concept nodes whose extent is too large or too small are of no interest. This paper proposes an effective high dimensional sparse data Clustering Algorithm Based On Concept Feature Vector (CABOCFV), which reduces the redundancy of concept construction using the 'Concept Sparse Feature Distance' and the 'Concept Feature Vector', and introduces an effective noise recognition strategy. The CABOCFV clustering algorithm is not sensitive to the input order of data objects and scans the database only once. Experiments show that CABOCFV is effective and efficient for high dimensional sparse data clustering.
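As an illustration of the intent/extent terminology in the abstract, the following is a minimal sketch of naive formal concept enumeration on a toy binary context, followed by a filter on extent size. It is not the CABOCFV algorithm: the abstract does not define the 'Concept Sparse Feature Distance' or the 'Concept Feature Vector', so neither is modeled here. The data in `context`, the function names, and the size thresholds are all hypothetical.

```python
from itertools import combinations

# Hypothetical toy context: object -> set of attributes it has (sparse binary data).
context = {
    "o1": {"a", "b"},
    "o2": {"a", "b", "c"},
    "o3": {"c"},
    "o4": {"a"},
}
attributes = sorted(set().union(*context.values()))


def extent(intent):
    """Objects that have every attribute in `intent`."""
    return frozenset(o for o, attrs in context.items() if intent <= attrs)


def common_intent(objects):
    """Attributes shared by every object in `objects` (all attributes if empty)."""
    result = set(attributes)
    for o in objects:
        result &= context[o]
    return frozenset(result)


def all_concepts():
    """Naive enumeration: close every attribute subset into an (extent, intent) pair.

    Exponential in the number of attributes, which is the blow-up in concept
    nodes that the abstract refers to.
    """
    concepts = set()
    for r in range(len(attributes) + 1):
        for subset in combinations(attributes, r):
            ext = extent(frozenset(subset))
            concepts.add((ext, common_intent(ext)))
    return concepts


def filter_by_extent_size(concepts, min_size, max_size):
    """Keep only concepts whose extent is neither too small nor too large."""
    return [c for c in concepts if min_size <= len(c[0]) <= max_size]


if __name__ == "__main__":
    for ext, intent in sorted(all_concepts(), key=lambda c: -len(c[0])):
        print(sorted(ext), sorted(intent))
```

Even on four objects and three attributes the full lattice holds more concepts than are useful as clusters; restricting extent size, as in `filter_by_extent_size(all_concepts(), 2, 3)`, keeps only the mid-sized groupings.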
Pages: 202-206
Page count: 5
Related papers
50 in total
  • [1] High Dimensional Data Clustering Algorithm Based on Sparse Feature Vector for Categorical Attributes
    Wu, Sen
    Wei, Guiying
    PROCEEDINGS OF 2010 INTERNATIONAL CONFERENCE ON LOGISTICS SYSTEMS AND INTELLIGENT MANAGEMENT, VOLS 1-3, 2010, : 973 - 976
  • [2] CABOSFV algorithm for high dimensional sparse data clustering
    Sen Wu
    Xuedong Gao
    Journal of University of Science and Technology Beijing(English Edition), 2004, (03) : 283 - 288
  • [3] CABOSFV algorithm for high dimensional sparse data clustering
    Wu, S
    Gao, XD
    JOURNAL OF UNIVERSITY OF SCIENCE AND TECHNOLOGY BEIJING, 2004, 11 (03): : 283 - 288
  • [4] Clustering Algorithm Based on Sparse Feature Vector without Specifying Parameter
    He, Huixia
    Wei, Guiying
    Wu, Sen
    Gao, Xiaonan
    TEHNICKI VJESNIK-TECHNICAL GAZETTE, 2020, 27 (06): : 1974 - 1981
  • [5] Parallel clustering algorithm based on sparse index sort of high dimensional data
    Wu, Sen
    Feng, Xiao-Dong
    Wu, Qing-Hai
    Xitong Gongcheng Lilun yu Shijian/System Engineering Theory and Practice, 2011, 31 (SUPPL. 2): : 13 - 18
  • [7] EFFECTIVE CLUSTERING ALGORITHM FOR HIGH-DIMENSIONAL SPARSE DATA BASED ON SOM
    Martinovic, Jan
    Slaninova, Katerina
    Vojacek, Lukas
    Drazdilova, Pavla
    Dvorsky, Jiri
    Vondrak, Ivo
    NEURAL NETWORK WORLD, 2013, 23 (02) : 131 - 147
  • [8] Clustering Algorithm Based On Sparse Feature Vector for Interval-Scaled Variables
    Wu, Sen
    Wei, Guiying
    Gu, Shujuan
    Ma, Xiaofang
    2007 INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS, NETWORKING AND MOBILE COMPUTING, VOLS 1-15, 2007, : 5561 - 5564
  • [9] A density-based clustering algorithm for high-dimensional data with feature selection
    Qi Xianting
    Wang Pan
    2016 2ND INTERNATIONAL CONFERENCE ON INDUSTRIAL INFORMATICS - COMPUTING TECHNOLOGY, INTELLIGENT TECHNOLOGY, INDUSTRIAL INFORMATION INTEGRATION (ICIICII), 2016, : 114 - 118
  • [10] A clustering based on information granularity for high dimensional sparse data
    Zhao, YQ
    Zhou, XZ
    2005 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING, VOLS 1 AND 2, 2005, : 363 - 366