Symbolic clustering of large datasets

Cited by: 2
Authors
Lechevallier, Yves
Verde, Rosanna [1]
de Carvalho, Francisco de A. T. [2]
Affiliations
[1] Seconda Univ Napoli, Dip Strateg Aziendali & Metod Quantit, I-81043 Capua, CE, Italy
[2] Cidade Univ, Ctr Informat, BR-50740540 Recife, PE, Brazil
Source
DATA SCIENCE AND CLASSIFICATION | 2006
DOI
10.1007/3-540-34416-0_21
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present an approach to clustering large datasets that integrates Kohonen Self-Organizing Maps (SOM) with a dynamic clustering algorithm for symbolic data (SCLUST). A preliminary data reduction is performed using the SOM algorithm. As a result, the individual measurements are replaced by micro-clusters. These micro-clusters are then grouped into a few clusters, which are modeled by symbolic objects. By computing the extension of these symbolic objects, the symbolic clustering algorithm allows the natural classes to be discovered. An application to a real data set shows the usefulness of this methodology.
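The two-stage idea in the abstract can be illustrated with a minimal sketch. This record gives only the abstract, so everything below is an assumption rather than the authors' method: a one-dimensional SOM stands in for the data-reduction step, plain k-means is swapped in for SCLUST, and a symbolic object is taken to be one [min, max] interval per variable. All function names are hypothetical.

```python
import math
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centers):
    """Index of the center closest to point."""
    return min(range(len(centers)), key=lambda i: sq_dist(point, centers[i]))

def train_som(data, n_units, epochs=20, seed=0):
    """Tiny 1-D SOM (assumed topology): returns n_units prototype vectors."""
    rng = random.Random(seed)
    protos = [list(p) for p in rng.sample(data, n_units)]
    steps, t = epochs * len(data), 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # shuffled pass over the data
            frac = t / steps
            lr = 0.5 * (1 - frac)                       # decaying learning rate
            radius = max(n_units / 2 * (1 - frac), 0.5)  # shrinking neighbourhood
            bmu = nearest(x, protos)
            for j, p in enumerate(protos):
                h = math.exp(-((j - bmu) ** 2) / (2 * radius ** 2))
                for d in range(len(p)):
                    p[d] += lr * h * (x[d] - p[d])
            t += 1
    return protos

def kmeans(points, k, iters=50, seed=0):
    """Plain k-means, used here only as a stand-in for SCLUST."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [nearest(p, centers) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def interval_description(rows):
    """Assumed symbolic object: one (min, max) interval per variable."""
    return [(min(col), max(col)) for col in zip(*rows)]

def extension(description, data):
    """Indices of individuals whose values fall inside every interval."""
    return [i for i, x in enumerate(data)
            if all(lo <= v <= hi for v, (lo, hi) in zip(x, description))]

def cluster_large_dataset(data, n_units=10, k=2, seed=0):
    protos = train_som(data, n_units, seed=seed)
    micro = [nearest(x, protos) for x in data]   # individual -> micro-cluster
    used = sorted(set(micro))                    # non-empty micro-clusters only
    k = min(k, len(used))
    macro = kmeans([protos[u] for u in used], k, seed=seed)
    unit_to_cluster = dict(zip(used, macro))
    labels = [unit_to_cluster[m] for m in micro]
    descriptions = {}
    for c in sorted(set(labels)):
        rows = [x for x, l in zip(data, labels) if l == c]
        descriptions[c] = interval_description(rows)
    return labels, descriptions
```

The clustering step only ever sees the prototypes, not the individual measurements, which is where the saving on a large dataset would come from. Because each description here is built from the minimum and maximum of a cluster's own members, every member lies in that description's extension; other individuals may fall inside it too, which is the kind of overlap a real symbolic clustering method would need to handle.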
Pages: 193 / +
Page count: 3
Related papers
50 in total
  • [41] A fast fuzzy clustering algorithm for large-scale datasets
    Shi, LK
    He, PL
    ADVANCED DATA MINING AND APPLICATIONS, PROCEEDINGS, 2005, 3584 : 203 - 208
  • [42] Vector quantization based approximate spectral clustering of large datasets
    Taşdemir, Kadim
    PATTERN RECOGNITION, 2012, 45 (08) : 3034 - 3044
  • [43] Tight clustering for large datasets with an application to gene expression data
    Bikram Karmakar
    Sarmistha Das
    Sohom Bhattacharya
    Rohan Sarkar
    Indranil Mukhopadhyay
    Scientific Reports, 9
  • [44] Tight clustering for large datasets with an application to gene expression data
    Karmakar, Bikram
    Das, Sarmistha
    Bhattacharya, Sohom
    Sarkar, Rohan
    Mukhopadhyay, Indranil
    SCIENTIFIC REPORTS, 2019, 9 (1)
  • [45] A clustering scheme for large high-dimensional document datasets
    Jiang, Jung-Yi
    Chen, Jing-Wen
    Lee, Shie-Jue
    ADVANCES IN COMPUTATION AND INTELLIGENCE, PROCEEDINGS, 2007, 4683 : 511 - 519
  • [46] Distributed Sketched Subspace Clustering for Large-scale Datasets
    Traganitis, Panagiotis A.
    Giannakis, Georgios B.
    2017 IEEE 7TH INTERNATIONAL WORKSHOP ON COMPUTATIONAL ADVANCES IN MULTI-SENSOR ADAPTIVE PROCESSING (CAMSAP), 2017,
  • [47] A modified hyperplane clustering algorithm allows for efficient and accurate clustering of extremely large datasets
    Sharma, Ashok
    Podolsky, Robert
    Zhao, Jieping
    McIndoe, Richard A.
    BIOINFORMATICS, 2009, 25 (09) : 1152 - 1157
  • [48] Systematic Review of Clustering High-Dimensional and Large Datasets
    Pandove, Divya
    Goel, Shivani
    Rani, Rinkle
    ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2018, 12 (02)
  • [49] An Incremental Density-Based Clustering Technique for Large Datasets
    Rehman, Saif Ur
    Khan, Muhammed Naeem Ahmed
    COMPUTATIONAL INTELLIGENCE IN SECURITY FOR INFORMATION SYSTEMS 2010, 2010, 85 : 3 - 11
  • [50] CLIC: clustering analysis of large microarray datasets with individual dimension-based clustering
    Yun, Taegyun
    Hwang, Taeho
    Cha, Kihoon
    Yi, Gwan-Su
    NUCLEIC ACIDS RESEARCH, 2010, 38 : W246 - W253