Hybrid Sampling-Based Clustering Ensemble With Global and Local Constitutions

Cited by: 106
Authors
Yang, Yun [1]
Jiang, Jianmin [2, 3]
Affiliations
[1] Yunnan Univ, Natl Pilot Sch Software, Kunming 650091, Peoples R China
[2] Shenzhen Univ, Sch Comp Sci & Software Engn, Res Inst Future Media Comp, Shenzhen 518060, Peoples R China
[3] Univ Surrey, Dept Comp, Guildford GU2 7XH, Surrey, England
Funding
National Natural Science Foundation of China
Keywords
Clustering ensemble; consensus clustering; data clustering; sampling; unsupervised learning; CONSENSUS; MODEL;
DOI
10.1109/TNNLS.2015.2430821
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Among a number of ensemble learning techniques, boosting and bagging are the most popular sampling-based ensemble approaches for classification problems. Boosting is considered stronger than bagging on noise-free data sets with complex class structures, whereas bagging is more robust than boosting when noisy data are present. In this paper, we extend both ensemble approaches to clustering tasks and propose a novel hybrid sampling-based clustering ensemble that combines the strengths of boosting and bagging. In our approach, the input partitions are iteratively generated via a hybrid process inspired by both boosting and bagging. Then, a novel consensus function encodes the local and global cluster structures of the input partitions into a single representation, and a single clustering algorithm is applied to this representation to obtain the consolidated consensus partition. Our approach has been evaluated on 2-D synthetic data, a collection of benchmarks, and real-world facial recognition data sets; the results show that the proposed technique outperforms the existing benchmarks for a variety of clustering tasks.
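The general idea of a sampling-based clustering ensemble can be illustrated with a minimal sketch. It is not the authors' algorithm: it uses only bagging-style bootstrap resampling, plain k-means as the base clusterer, and a simple co-association matrix with a threshold as the consensus step. The boosting-style component and the paper's local/global consensus function are not described in enough detail in the abstract to reproduce, so they are omitted. All function names, the toy data, and the parameter values are assumptions made for illustration.

```python
import random
from math import dist  # available in Python 3.8+

def nearest(p, centroids):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))

def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means on 2-D tuples; returns the final centroids."""
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Keep the old centroid when a group ends up empty.
        centroids = [
            (sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids

def bagged_partitions(points, k, n_partitions, rng):
    """Cluster bootstrap resamples, then label every original point."""
    partitions = []
    for _ in range(n_partitions):
        sample = rng.choices(points, k=len(points))  # sampling with replacement
        centroids = kmeans(sample, k, rng)
        partitions.append([nearest(p, centroids) for p in points])
    return partitions

def co_association(partitions):
    """Fraction of partitions in which each pair of points shares a cluster."""
    n = len(partitions[0])
    return [
        [sum(part[i] == part[j] for part in partitions) / len(partitions) for j in range(n)]
        for i in range(n)
    ]

def consensus(matrix, threshold=0.5):
    """Link pairs whose co-association exceeds the threshold and label
    the resulting connected components (an assumed, simple consensus rule)."""
    n = len(matrix)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and matrix[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

# Hypothetical toy data: two well-separated groups laid out on small grids.
group_a = [(0.25 * i, 0.25 * j) for i in range(4) for j in range(3)]
group_b = [(10 + 0.25 * i, 10 + 0.25 * j) for i in range(4) for j in range(3)]
points = group_a + group_b

rng = random.Random(0)
partitions = bagged_partitions(points, k=2, n_partitions=15, rng=rng)
labels = consensus(co_association(partitions))
print(labels)
```

The co-association matrix plays the role of the "single representation" here: it summarizes all input partitions in one object, and a single final step (thresholding plus connected components) turns it into one consensus partition. A hybrid scheme in the spirit of the abstract would additionally adapt the sampling between iterations, which this sketch does not attempt.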
Pages: 952-965
Page count: 14