A data driven procedure for density estimation with some applications

Cited by: 6
Authors
Chaudhuri, D
Chaudhuri, BB
Murthy, CA
Affiliations
[1] Indian Statistical Institute, Computer Vision & Pattern Recognition Unit, Calcutta 700035, West Bengal, India
[2] Indian Statistical Institute, Machine Intelligence Unit, Calcutta 700035, West Bengal, India
Keywords
probability density estimation; kernel method; window selection; minimal spanning tree; bounded set; asymptotically unbiased estimator; representative point
DOI
10.1016/0031-3203(96)00028-3
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper deals with probability density estimation using a kernel-based approach in which the window size of the kernel is found by a data-driven procedure. It is theoretically shown that, under certain assumptions, the estimated densities on bounded sets can be asymptotically unbiased when the window width is obtained from the minimal spanning tree of the observed data. The theoretical development, initially carried out on R^2, is applicable to higher-dimensional spaces. The results are experimentally verified on bounded sets with different types of distributions. The behaviour of the estimator on an unbounded set, as for a Gaussian density, is also experimentally seen to be good. Some applications of the proposed density estimation technique are demonstrated. One application is a representative point detection algorithm, which can be applied to data reduction and outlier rejection. Another application involves detecting the border points of a dot pattern as well as finding a thinned version of the dot pattern. Copyright (C) 1996 Pattern Recognition Society.
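The core idea in the abstract, a kernel estimate whose window width comes from the minimal spanning tree (MST) of the data, can be illustrated with a minimal sketch. The abstract does not state how the width is derived from the tree, so the rule used here (mean MST edge length) is an assumption made purely for illustration, as is the choice of a Gaussian kernel. The function names `mst_edge_lengths`, `mst_bandwidth`, and `kde` are hypothetical and not taken from the paper.

```python
import math
import random


def mst_edge_lengths(points):
    """Return the n-1 edge lengths of the Euclidean MST (Prim's algorithm, O(n^2))."""
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best = [math.inf] * n  # shortest known distance from each point to the tree
    best[0] = 0.0          # start the tree at the first point
    lengths = []
    for step in range(n):
        # Add the point that is currently closest to the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if step > 0:       # the starting point contributes no edge
            lengths.append(best[u])
        # Update distances of the remaining points to the grown tree.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return lengths


def mst_bandwidth(points):
    """ASSUMPTION: window width = mean MST edge length (not the paper's rule)."""
    lengths = mst_edge_lengths(points)
    if not lengths:
        raise ValueError("need at least two points")
    return sum(lengths) / len(lengths)


def kde(x, points, h):
    """Gaussian-kernel density estimate at x with window width h (any dimension)."""
    d = len(x)
    norm = len(points) * (h * math.sqrt(2.0 * math.pi)) ** d
    total = sum(math.exp(-0.5 * (math.dist(x, p) / h) ** 2) for p in points)
    return total / norm


# Demo on a bounded set: points drawn uniformly from the unit square in R^2.
random.seed(0)
sample = [(random.random(), random.random()) for _ in range(200)]
h = mst_bandwidth(sample)
print("window width:", h)
print("estimated density at (0.5, 0.5):", kde((0.5, 0.5), sample, h))
```

Because the window width is computed from the sample itself, no bandwidth has to be supplied by the user, which is the sense in which the procedure is data-driven. The estimate at a single point is noisy for small samples, so the demo prints the values without claiming a particular result.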
Pages: 1719-1736
Page count: 18