A Bernstein-Von Mises Theorem for discrete probability distributions

Cited by: 23
Authors
Boucheron, S. [1 ,2 ]
Gassiat, E. [3 ]
Affiliations
[1] CNRS, LPMA, F-75700 Paris, France
[2] Univ Paris Diderot, Paris, France
[3] Univ Paris 11, Paris, France
Keywords
Bernstein-von Mises theorem; entropy estimation; non-parametric Bayesian statistics; discrete models; concentration inequalities
DOI
10.1214/08-EJS262
Chinese Library Classification (CLC)
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline codes
020208; 070103; 0714
Abstract
We investigate the asymptotic normality of the posterior distribution in the discrete setting, when the model dimension increases with the sample size. We consider a probability mass function $\theta_0$ on $\mathbb{N}\setminus\{0\}$ and a sequence of truncation levels $(k_n)_n$ satisfying $k_n^3 \le n \inf_{i \le k_n} \theta_0(i)$. Let $\hat\theta_n$ denote the maximum likelihood estimate of $(\theta_0(i))_{i \le k_n}$ and let $\Delta_n(\theta_0)$ denote the $k_n$-dimensional vector whose $i$-th coordinate is $\sqrt{n}\,(\hat\theta_n(i) - \theta_0(i))$ for $1 \le i \le k_n$. We check that, under mild conditions on $\theta_0$ and on the sequence of prior probabilities on the $k_n$-dimensional simplices, the variation distance between the posterior distribution recentered around $\hat\theta_n$ and rescaled by $\sqrt{n}$ and the $k_n$-dimensional Gaussian distribution $\mathcal{N}(\Delta_n(\theta_0), I^{-1}(\theta_0))$ converges in probability to 0. This theorem can be used to prove the asymptotic normality of Bayesian estimators of Shannon and Rényi entropies. The proofs are based on concentration inequalities for centered and non-centered chi-square (Pearson) statistics. The latter allow us to establish posterior concentration rates with respect to the Fisher distance rather than the Hellinger distance, as is commonplace in non-parametric Bayesian statistics.
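The statement in the abstract can be illustrated numerically for a fixed truncation level: under a Dirichlet prior on the simplex, posterior draws recentered at the MLE and rescaled by $\sqrt{n}$ should look approximately Gaussian with the multinomial (inverse Fisher information) marginal variances $\theta_0(i)(1-\theta_0(i))$. A minimal sketch, with an illustrative pmf, prior, and sample size chosen here rather than taken from the paper:

```python
import numpy as np

# Illustrative setup: a fixed pmf on {1,...,4}. (The paper treats
# truncation levels k_n growing with n; k is held fixed here.)
rng = np.random.default_rng(0)
theta0 = np.array([0.4, 0.3, 0.2, 0.1])
n = 20_000

# Observe multinomial counts; the MLE is the vector of empirical frequencies.
counts = rng.multinomial(n, theta0)
theta_hat = counts / n

# Under a flat Dirichlet(1,...,1) prior, the posterior is Dirichlet(1 + counts).
# Recenter posterior draws at the MLE and rescale by sqrt(n).
posterior = rng.dirichlet(1.0 + counts, size=5_000)
z = np.sqrt(n) * (posterior - theta_hat)

# Bernstein-von Mises prediction: z is approximately centered Gaussian,
# with marginal variances theta0(i) * (1 - theta0(i)).
print("posterior mean of z :", z.mean(axis=0))
print("posterior var of z  :", z.var(axis=0))
print("theta0*(1-theta0)   :", theta0 * (1 - theta0))
```

With these sample sizes the rescaled posterior means are close to zero and the variances close to $\theta_0(i)(1-\theta_0(i))$, in line with the theorem.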
Pages: 114 / 148
Page count: 35
Related papers
36 records in total
[1] [Anonymous], 1991, ELEMENTS INFORM THEO
[2] [Anonymous], 1981, Information Theory: Coding Theorems for Discrete Memoryless Systems
[3] [Anonymous], 2004, MATH APPL
[4] [Anonymous], 1965, Probability Theory and Related Fields, DOI 10.1007/BF00535479
[5] Antos A, Kontoyiannis I. Convergence properties of functional estimates for discrete distributions. Random Structures & Algorithms, 2001, 19(3-4): 163-193
[6] Boucheron S, 2009, IEEE Transactions on Information Theory, in press, V55
[7] Clarke BS, Barron AR. Information-theoretic asymptotics of Bayes methods. IEEE Transactions on Information Theory, 1990, 36(3): 453-471
[8] Clarke BS, Barron AR. Jeffreys' prior is asymptotically least favorable under entropy risk. Journal of Statistical Planning and Inference, 1994, 41(1): 37-60
[9] Doob JL, 1949, C INT CTR NATL RECHE, V13, P23
[10] Dubhashi D, 1998, Random Structures & Algorithms, 13(2): 99, DOI 10.1002/(SICI)1098-2418(199809)13:2<99::AID-RSA1>3.0.CO