Representative cross information potential clustering

被引:5
作者
Araujo, Daniel [1 ,2 ]
Doria Neto, Adriao [2 ]
Martins, Allan [2 ]
机构
[1] Fed Rural Univ Semiarido, Angicos, RN, Brazil
[2] Univ Fed Rio Grande do Norte, Dept Comp Engn & Automat, BR-59072970 Natal, RN, Brazil
关键词
Clustering; Entropy; Information theory; Information potential; Complex data; Image segmentation;
D O I
10.1016/j.patrec.2013.08.021
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
This paper proposes an information-theoretic approach for clustering with a new measure of cross information potential and two clustering algorithms. Instead of using all points of the dataset, the proposed measure uses representative points to quantify the interaction between distributions without any loss of the original properties of cross information potential. This brings a double advantage. It decreases the cost of computing the cross information potential, thus drastically reducing the running time. Secondly, it captures the interaction among the data points by utilizing the underlying statistics of the space region centered around the representative points. With this, we have made it possible to use cross information potential in applications where it was not. We also proposed two algorithms for clustering which explore the idea of creating links between regions of the feature space that are highly correlated. We ran several tests and compared the results with single linkage hierarchical algorithm, finite mixture of Gaussians and spectral clustering in both synthetic and real image segmentation datasets. Experiments showed that our approach achieved better results compared to the other algorithms and it was capable of capture the real structure of the data in most cases regardless of its complexity. It also produced good image segmentation with the advantage of a tuning parameter that provides a way of refine segmentation. (C) 2013 Elsevier B. V. All rights reserved.
引用
收藏
页码:2181 / 2191
页数:11
相关论文
共 15 条
[1]  
[Anonymous], 1986, DENSITY ESTIMATION S
[2]  
Araujo D., 2010, LECT NOTES COMPUTER, V6354, P397
[3]  
CHARIKAR M, 1997, P 29 ANN ACM S THEOR, P626, DOI DOI 10.1145/258533.258657
[4]   Information theoretic clustering [J].
Gokcay, E ;
Principe, JC .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2002, 24 (02) :158-171
[5]   Pairwise data clustering by deterministic annealing [J].
Hofmann, T ;
Buhmann, JM .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1997, 19 (01) :1-14
[6]  
Jain A. K., 1988, Algorithms for Clustering Data
[7]   Data clustering: 50 years beyond K-means [J].
Jain, Anil K. .
PATTERN RECOGNITION LETTERS, 2010, 31 (08) :651-666
[8]  
Ng AY, 2002, ADV NEUR IN, V14, P849
[9]   ESTIMATION OF A PROBABILITY DENSITY-FUNCTION AND MODE [J].
PARZEN, E .
ANNALS OF MATHEMATICAL STATISTICS, 1962, 33 (03) :1065-&
[10]  
Principe JC, 2010, INFORM SCI STAT, P1, DOI 10.1007/978-1-4419-1570-2