Estimating the Major Cluster by Mean-Shift with Updating Kernel

Cited by: 1
Authors
Tian, Ye [1]
Yokota, Yasunari [2]
Affiliations
[1] Gifu Univ, Grad Sch Engn, 1-1 Yanagido, Gifu 5011193, Japan
[2] Gifu Univ, Dept EECE, Fac Engn, 1-1 Yanagido, Gifu 5011193, Japan
Keywords
kernel bandwidth and shape; mean-shift; major cluster; mode estimation; updating kernel; bandwidth selection; maximum likelihood; space
DOI
10.3390/math7090771
CLC number
O1 [Mathematics]
Subject classification code
0701; 070101
Abstract
The mean-shift method is a convenient mode-seeking method. It repeatedly computes the sample mean over an analysis window, or kernel, placed in the data space. Because the samples inside the kernel are biased toward the densest direction as seen from the kernel center, moving the kernel to that mean iteratively seeks the densest point of the samples, that is, the sample mode. A kernel that is too small converges to a local mode produced by statistical fluctuation. A kernel that is too large yields a biased mode estimate affected by other clusters, abnormal values, or outliers lying outside the major cluster. Optimal selection of the kernel size, called the bandwidth in much of the literature, is therefore an important problem. Assuming that the major cluster follows a Gaussian probability density, assuming that the outliers do not affect the sample mode of the major cluster, and adopting a Gaussian kernel, we propose a new mean-shift in which both the mean vector and the covariance matrix of the major cluster are estimated in each iteration, after which the kernel size and shape are updated adaptively. Numerical experiments indicate that the mean vector, covariance matrix, and number of samples of the major cluster can be estimated stably. Because the kernel can take an anisotropic as well as an isotropic shape according to the sample distribution, the proposed method achieves higher estimation precision than the general mean-shift.
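The idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not give the update equations, so the covariance update below is an assumption chosen for illustration. It relies on a standard property of Gaussians: if the major cluster has covariance Σ and the Gaussian kernel has the same covariance, the kernel-weighted samples have covariance Σ/2, so doubling the weighted covariance gives a self-consistent kernel update. The function names `gaussian_weights` and `mean_shift_updating_kernel` are hypothetical.

```python
import numpy as np


def gaussian_weights(x, center, cov):
    """Unnormalised Gaussian kernel weight of every sample around `center`."""
    diff = x - center
    # Squared Mahalanobis distance of each sample under the kernel covariance.
    d2 = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
    return np.exp(-0.5 * d2)


def mean_shift_updating_kernel(x, center, cov, n_iter=100, tol=1e-8):
    """Mean-shift whose Gaussian kernel covariance is re-estimated each step.

    Assumed update rule (illustrative, not taken from the paper):
    new kernel covariance = 2 * kernel-weighted sample covariance.
    """
    x = np.asarray(x, dtype=float)
    center = np.asarray(center, dtype=float)
    cov = np.asarray(cov, dtype=float)
    for _ in range(n_iter):
        w = gaussian_weights(x, center, cov)
        w_sum = w.sum()
        # Ordinary mean-shift step: kernel-weighted sample mean.
        new_center = w @ x / w_sum
        # Kernel-weighted covariance around the new center.
        diff = x - new_center
        weighted_cov = (w[:, None] * diff).T @ diff / w_sum
        # Adapt kernel size and shape (isotropic or anisotropic) to the data.
        new_cov = 2.0 * weighted_cov
        shift = np.linalg.norm(new_center - center)
        change = np.linalg.norm(new_cov - cov)
        center, cov = new_center, new_cov
        if shift < tol and change < tol:
            break
    return center, cov
```

Called with an `(n, d)` sample array, an initial center, and a deliberately wide initial covariance such as `4.0 * np.eye(d)`, the function returns an estimated mean vector and covariance matrix for the dominant cluster. Starting wide is a design choice of this sketch: it avoids the small-kernel failure described in the abstract, where the iteration locks onto a local mode caused by statistical fluctuation.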
Pages: 25