A density-adaptive affinity propagation clustering algorithm based on spectral dimension reduction

Cited by: 30
Authors
Jia, Hongjie [1, 2]
Ding, Shifei [1, 2]
Meng, Lingheng [1 ]
Fan, Shuyan [1 ]
Affiliations
[1] China Univ Min & Technol, Sch Comp Sci & Technol, Xuzhou 221116, Peoples R China
[2] Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Spectral dimension reduction; Distance measure; Similarity matrix; Affinity propagation clustering; PREDICTION;
DOI
10.1007/s00521-014-1628-7
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
As a novel clustering method, affinity propagation (AP) clustering can identify high-quality cluster centers by passing messages between data points. However, the final number of clusters depends on a user-defined parameter called self-confidence. When prior knowledge fixes the desired number of clusters, AP has to be run many times until a suitable self-confidence setting is found. The K-AP algorithm overcomes this drawback by adding a constraint to the message-passing process so that it yields K clusters directly. The key to K-AP clustering is constructing a suitable similarity matrix that faithfully reflects the intrinsic structure of the dataset. In this paper, a density-adaptive similarity measure is designed to describe the relations between data points more reasonably. In addition, to address the difficulties the K-AP algorithm faces on high-dimensional datasets, we use a dimension reduction method based on spectral graph theory to map the original data points to a low-dimensional eigenspace, and we propose a density-adaptive AP clustering algorithm based on spectral dimension reduction. Experiments show that the proposed algorithm can effectively handle clustering problems on datasets with complex structure and multiple scales, avoiding the singularity problem caused by high-dimensional eigenvectors. Its clustering performance is better than that of the AP and K-AP clustering algorithms.
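The two preprocessing ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the record does not give their density-adaptive similarity formula, so a generic local-scaling Gaussian affinity is assumed here as a stand-in, and the spectral step is a standard normalized-affinity eigenvector embedding. The function names `density_adaptive_affinity` and `spectral_embedding`, the neighbour count `k`, and the toy data are all hypothetical.

```python
import numpy as np

def density_adaptive_affinity(X, k=7):
    """Gaussian affinity with a per-point scale (assumed stand-in, not the paper's measure)."""
    diff = X[:, None, :] - X[None, :, :]
    d2 = (diff ** 2).sum(axis=-1)              # pairwise squared Euclidean distances
    kk = min(k, len(X) - 1)
    # sigma_i: distance from point i to its kk-th nearest neighbour
    # (index 0 of each sorted row is the point itself)
    sigma = np.sqrt(np.sort(d2, axis=1)[:, kk])
    sigma = np.maximum(sigma, 1e-12)           # guard against duplicate points
    W = np.exp(-d2 / (sigma[:, None] * sigma[None, :]))
    np.fill_diagonal(W, 0.0)                   # no self-affinity
    return W

def spectral_embedding(W, dim):
    """Map points to the top `dim` eigenvectors of D^{-1/2} W D^{-1/2}."""
    deg = np.maximum(W.sum(axis=1), 1e-12)
    inv_sqrt = 1.0 / np.sqrt(deg)
    M = inv_sqrt[:, None] * W * inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(M)                # eigenvalues in ascending order
    Y = vecs[:, -dim:]                         # eigenvectors of the largest eigenvalues
    norms = np.maximum(np.linalg.norm(Y, axis=1, keepdims=True), 1e-12)
    return Y / norms                           # unit-length rows

# Toy data (hypothetical): two well-separated blobs in 10 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, size=(20, 10)),
               rng.normal(5.0, 0.3, size=(20, 10))])

W = density_adaptive_affinity(X)
Y = spectral_embedding(W, dim=2)

# Negative squared distances in the eigenspace: one common choice of
# similarity matrix to pass to AP or K-AP message passing.
S = -((Y[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
print(Y.shape, S.shape)
```

In this sketch the matrix `S` is what would be handed to an AP or K-AP implementation. Scaling each pairwise distance by the neighbourhood radii of both points is one simple way to let the similarity adapt to local density.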
Pages: 1557-1567
Page count: 11