Clustering Data of Mixed Categorical and Numerical Type With Unsupervised Feature Learning

Cited by: 46
Authors
Lam, Dao [1]
Wei, Mingzhen [2]
Wunsch, Donald [1]
Affiliations
[1] Missouri Univ Sci & Technol, Dept Elect & Comp Engn, Appl Computat Intelligence Lab, Rolla, MO 65401 USA
[2] Missouri Univ Sci & Technol, Dept Geol Sci & Engn, Rolla, MO 65409 USA
Source
IEEE ACCESS | 2015 / Volume 3
Keywords
Clustering; unsupervised feature learning; mixed-type data; fuzzy ART; ALGORITHM;
DOI
10.1109/ACCESS.2015.2477216
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Mixed-type categorical and numerical data are a challenge in many applications. This general area of mixed-type data is among the frontier areas, where computational intelligence approaches are often brittle compared with the capabilities of living creatures. In this paper, unsupervised feature learning (UFL) is applied to the mixed-type data to achieve a sparse representation, which makes it easier for clustering algorithms to separate the data. Unlike other UFL methods that work with homogeneous data, such as image and video data, the presented UFL works with the mixed-type data using fuzzy adaptive resonance theory (ART). UFL with fuzzy ART (UFLA) obtains a better clustering result by removing the differences in treating categorical and numeric features. The advantages of doing this are demonstrated with several real world data sets with ground truth, including heart disease, teaching assistant evaluation, and credit approval. The approach is also demonstrated on noisy, mixed-type petroleum industry data. UFLA is compared with several alternative methods. To the best of our knowledge, this is the first time UFL has been extended to accomplish the fusion of mixed data types.
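To make the abstract's central idea more concrete, the following is a minimal sketch of standard fuzzy ART clustering (complement coding, choice function, vigilance test, and learning rule) applied to a toy mixed-type data set. It is not the authors' UFLA pipeline: the encoding (numeric feature scaled to [0, 1], categorical feature one-hot encoded), the function names `complement_code` and `fuzzy_art`, the parameter values, and the toy data are all assumptions chosen for illustration.

```python
import numpy as np


def complement_code(a):
    """Complement coding: map each row a in [0, 1]^d to [a, 1 - a]."""
    return np.concatenate([a, 1.0 - a], axis=1)


def fuzzy_art(inputs, rho=0.75, alpha=0.001, beta=1.0):
    """Cluster complement-coded rows with fuzzy ART (illustrative sketch).

    rho is the vigilance, alpha the choice parameter, beta the learning rate.
    Returns the category weight matrix and one category label per row.
    """
    weights = []  # one weight vector per learned category
    labels = []
    for i_vec in inputs:
        order = []
        if weights:
            w = np.array(weights)
            # |I ^ w_j|, where ^ is the element-wise minimum (fuzzy AND)
            match = np.minimum(i_vec, w).sum(axis=1)
            # Choice function T_j = |I ^ w_j| / (alpha + |w_j|)
            choice = match / (alpha + w.sum(axis=1))
            order = np.argsort(-choice, kind="stable")
        for j in order:
            # Vigilance test: |I ^ w_j| / |I| >= rho
            if match[j] / i_vec.sum() >= rho:
                weights[j] = (beta * np.minimum(i_vec, weights[j])
                              + (1.0 - beta) * weights[j])
                labels.append(int(j))
                break
        else:
            # No existing category resonates: commit a new one
            weights.append(i_vec.copy())
            labels.append(len(weights) - 1)
    return np.array(weights), labels


# Toy mixed-type rows (assumed): one numeric feature already scaled to
# [0, 1], followed by a one-hot encoding of a two-level categorical feature.
raw = np.array([
    [0.10, 1.0, 0.0],
    [0.15, 1.0, 0.0],
    [0.90, 0.0, 1.0],
    [0.85, 0.0, 1.0],
])
weights, labels = fuzzy_art(complement_code(raw))
print(labels)
```

Because every feature is mapped into [0, 1] before complement coding, the numeric and categorical columns are handled by the same fuzzy AND operation, which is one plausible reading of "removing the differences in treating categorical and numeric features". Raising `rho` produces more, tighter categories.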
Pages: 1605-1613
Page count: 9