Self-consistent method for density estimation

Cited by: 33
Authors
Bernacchia, Alberto [1]
Pigolotti, Simone [2]
Affiliations
[1] Yale Univ, Dept Neurobiol, New Haven, CT 06510 USA
[2] Niels Bohr Inst, DK-2100 Copenhagen, Denmark
Keywords
Binning; Kernel density estimation; Non-parametric statistics; PROBABILITY; DISTRIBUTIONS; FIELD; BIAS;
DOI
10.1111/j.1467-9868.2011.00772.x
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification codes
020208 ; 070103 ; 0714 ;
Abstract
The estimation of a density profile from experimental data points is a challenging problem, which is usually tackled by plotting a histogram. Prior assumptions on the nature of the density, from its smoothness to the specification of its form, allow the design of more accurate estimation procedures, such as maximum likelihood. Our aim is to construct a procedure that makes no explicit assumptions, yet still provides an accurate estimate of the density. We introduce the self-consistent estimate: the power spectrum of a candidate density is given, and an estimation procedure is constructed on the assumption, to be released a posteriori, that the candidate is correct. The self-consistent estimate is defined as a prior candidate density that precisely reproduces itself. Our main result is to derive the exact expression of the self-consistent estimate for any given data set, and to study its properties. Applications of the method require neither priors on the form of the density nor the subjective choice of parameters. A cutoff frequency, akin to a bin size or a kernel bandwidth, emerges naturally from the derivation. We apply the self-consistent estimate to artificial data generated from various distributions and show that it reaches the theoretical limit for the scaling of the square error with the size of the data set.
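The abstract states that an exact expression exists but does not reproduce it. As a rough illustration only, the sketch below assumes the Fourier-space form usually quoted for this estimator: with Δ(t) the empirical characteristic function of N samples, the kernel is κ(t) = N / (2(N−1)) · [1 + sqrt(1 − 4(N−1) / (N² |Δ(t)|²))] on frequencies where |Δ(t)|² ≥ 4(N−1)/N², and zero elsewhere. The function name `self_consistent_density`, the frequency grid parameters `t_max` and `n_t`, and the rule of keeping only the contiguous band around t = 0 are assumptions made for this example, not details taken from the record above.

```python
import numpy as np

def self_consistent_density(samples, x_grid, t_max=10.0, n_t=1001):
    """Hedged sketch of a self-consistent density estimate in 1D.

    Assumes the closed-form Fourier-space kernel described in the lead-in;
    t_max and n_t only set the numerical frequency grid (hypothetical names).
    """
    samples = np.asarray(samples, dtype=float)
    x_grid = np.asarray(x_grid, dtype=float)
    n = samples.size

    # Symmetric frequency grid; n_t is odd so the midpoint sits at t = 0.
    t = np.linspace(-t_max, t_max, n_t)
    dt = t[1] - t[0]

    # Empirical characteristic function: mean of exp(i t X_j) over the samples.
    ecf = np.exp(1j * np.outer(t, samples)).mean(axis=1)
    ecf_sq = np.abs(ecf) ** 2

    # Frequencies whose empirical power exceeds the stability threshold.
    threshold = 4.0 * (n - 1) / n**2
    above = ecf_sq >= threshold

    # Simplifying assumption: keep only the contiguous band around t = 0,
    # which plays the role of the cutoff frequency.
    mid = n_t // 2
    keep = np.zeros(n_t, dtype=bool)
    i = mid
    while i < n_t and above[i]:
        keep[i] = True
        i += 1
    keep |= keep[::-1]  # mirror to negative frequencies

    # Kernel in Fourier space; the clip guards against tiny negative
    # round-off inside the square root.
    kernel = np.zeros(n_t)
    inner = np.maximum(0.0, 1.0 - threshold / ecf_sq[keep])
    kernel[keep] = n / (2.0 * (n - 1)) * (1.0 + np.sqrt(inner))

    # Inverse Fourier transform by direct summation:
    # f(x) = (1 / 2*pi) * integral of kernel(t) * ecf(t) * exp(-i t x) dt
    phi = ecf * kernel
    density = (np.exp(-1j * np.outer(x_grid, t)) @ phi).real * dt / (2.0 * np.pi)
    return density
```

Called with, say, 1000 draws from a standard normal and `x_grid = np.linspace(-6, 6, 241)`, the function returns one density value per grid point without any bandwidth being supplied by the user. Because the frequency cutoff is sharp, the result can show small oscillations and slightly negative values in the tails.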
Pages: 407-422
Page count: 16