Symbolic Clustering with Interval-Valued Data

Cited by: 19
Authors
Sato-Ilic, Mika [1]
Affiliation
[1] Univ Tsukuba, Fac Syst & Informat Engn, Tsukuba, Ibaraki 3058573, Japan
Source
COMPLEX ADAPTIVE SYSTEMS | 2011 / Vol. 6
Keywords
symbolic data; interval-valued data; clustering; high dimension low sample-size data; subspace of variables
DOI
10.1016/j.procs.2011.08.066
Chinese Library Classification
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
While many clustering techniques for interval-valued data have been proposed, no fuzzy clustering method with added variable selection has been proposed for high dimension low sample-size interval-valued data. This paper proposes such a fuzzy clustering method for interval-valued data with adaptable variable selection. The method is needed for three reasons. First, the target data in this study are high dimension low sample-size data. Because of the curse of dimensionality, classification results for this type of data tend to be poor, mainly because of noise from irrelevant and redundant variables (dimensions). An adaptable variable selection is therefore needed to reduce or summarize the variables. Second, fuzzy clustering yields results with uncertain cluster boundaries, which suits the uncertainty inherent in classifying data. It is more robust to noise in the data than hard clustering, and mathematically the result takes continuous values. Third, an adaptable representation of interval-valued data can be exploited to transform the original data into a more manageable form in order to avoid the curse of dimensionality. Numerical examples show high performance for the proposed method. (C) 2011 Published by Elsevier B.V.
Pages: 6
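The abstract describes fuzzy clustering of interval-valued data only in words. As a rough illustration of the general idea, and not the method proposed in the paper, the sketch below runs standard fuzzy c-means on a simple midpoint and half-width representation of each interval. The variable selection step from the abstract is omitted. The function names (`to_features`, `fuzzy_c_means`), the fuzzifier `m=2.0`, and the toy data are all assumptions made for this example.

```python
import random


def to_features(intervals):
    """Represent one object's (low, high) intervals as midpoints followed by half-widths.

    This midpoint/half-width encoding is an assumption of this sketch.
    """
    mids = [(lo + hi) / 2.0 for lo, hi in intervals]
    rads = [(hi - lo) / 2.0 for lo, hi in intervals]
    return mids + rads


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fuzzy_c_means(points, k, m=2.0, iters=100, seed=0):
    """Standard fuzzy c-means; returns (memberships, centers)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, each row normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(k)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by membership ** m.
        centers = []
        for c in range(k):
            w = [u[i][c] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total for d in range(dim)]
            )

        # Membership update: proportional to squared distance ** (-1 / (m - 1)).
        for i in range(n):
            d2 = [max(sq_dist(points[i], centers[c]), 1e-12) for c in range(k)]
            inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
            total = sum(inv)
            u[i] = [v / total for v in inv]

    return u, centers


# Toy data (made up): 4 objects, each described by 2 interval-valued variables.
data = [
    [(1.0, 2.0), (0.5, 1.5)],
    [(1.2, 2.2), (0.4, 1.6)],
    [(8.0, 9.5), (7.0, 8.0)],
    [(8.3, 9.0), (7.2, 8.4)],
]

features = [to_features(obj) for obj in data]
memberships, centers = fuzzy_c_means(features, k=2)

# Hard labels, taken only for inspection; the fuzzy result is the membership matrix.
labels = [max(range(2), key=lambda c: row[c]) for row in memberships]
print(memberships)
print(labels)
```

Each row of `memberships` holds continuous degrees of belonging that sum to 1, which is the "uncertain cluster boundaries" property the abstract contrasts with hard clustering.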