On divergences and informations in statistics and information theory

Cited by: 385
Authors
Liese, Friedrich [1]
Vajda, Igor [2]
Affiliations
[1] Univ Rostock, Dept Math, D-18051 Rostock, Germany
[2] Acad Sci Czech Republ, Inst Informat Theory & Automat, Prague 18208, Czech Republic
Keywords
Arimoto divergence; Arimoto entropy; Arimoto information; deficiency; discrimination information; f-divergence; minimum f-divergence estimators; minimum f-divergence tests; Shannon divergence; Shannon information; statistical information; sufficiency;
DOI
10.1109/TIT.2006.881731
CLC classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The paper deals with the f-divergences of Csiszar, which generalize the discrimination information of Kullback, the total variation distance, the Hellinger divergence, and the Pearson divergence. All basic properties of f-divergences, including relations to the decision errors, are proved in a new manner, replacing the classical Jensen inequality by a new generalized Taylor expansion of convex functions. Some new properties are proved as well, e.g., relations to statistical sufficiency and deficiency. The generalized Taylor expansion also shows very easily that all f-divergences are average statistical informations (differences between prior and posterior Bayes errors), mutually differing only in the weights imposed on the various prior distributions. The statistical information introduced by De Groot and the classical information of Shannon are shown to be extremal cases, corresponding to alpha = 0 and alpha = 1, in the class of the so-called Arimoto alpha-informations introduced in this paper for 0 < alpha < 1 by means of the Arimoto alpha-entropies. Some new examples of f-divergences are introduced as well, namely, the Shannon divergences and the Arimoto alpha-divergences, which lead to the Shannon divergences as alpha increases to 1. Square roots of all these divergences are shown to be metrics satisfying the triangle inequality. The last section introduces statistical tests and estimators based on the minimal f-divergence with the empirical distribution, attained over the families of hypothetical distributions. For the Kullback divergence this leads to the classical likelihood ratio test and estimator.
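As a brief orienting sketch of the abstract's central notions (standard definitions only; the weight measure in the last display is left schematic, since its exact form is established in the paper itself rather than assumed here): for a convex function $f$ with $f(1)=0$ and distributions $P$, $Q$ with densities $p$, $q$ relative to a dominating measure $\mu$, the Csiszar f-divergence is
\[
  D_f(P,Q) \;=\; \int f\!\left(\frac{p}{q}\right) q \,\mathrm{d}\mu ,
\]
where $f(t)=t\ln t$ yields the Kullback discrimination information, $f(t)=|t-1|$ the total variation distance, $f(t)=(\sqrt{t}-1)^2$ the squared Hellinger distance, and $f(t)=(t-1)^2$ the Pearson divergence. De Groot's statistical information for a prior $(\pi,\,1-\pi)$ on the two hypotheses is the prior Bayes error minus the posterior Bayes error,
\[
  I_\pi(P,Q) \;=\; \min(\pi,\,1-\pi) \;-\; \int \min\bigl(\pi p,\,(1-\pi)q\bigr)\,\mathrm{d}\mu ,
\]
and the averaging representation asserted in the abstract then reads, schematically,
\[
  D_f(P,Q) \;=\; \int_0^1 I_\pi(P,Q)\,\mathrm{d}\gamma_f(\pi)
\]
for a weight measure $\gamma_f$ on $(0,1)$ determined by the convex function $f$.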
Pages: 4394-4412
Number of pages: 19
References
54 in total
[1] Ali, S. M. (1966). J. Roy. Stat. Soc. B, 28, 131.
[2] Arimoto, S. (1971). Information-theoretical considerations on estimation problems. Information and Control, 19(3), 181.
[3] Barron, A. R., Gyorfi, L., van der Meulen, E. C. (1992). Distribution estimation consistent in total variation and in two types of information divergence. IEEE Transactions on Information Theory, 38(5), 1437-1454.
[4] Berlinet, A., Vajda, I., van der Meulen, E. C. (1998). About the asymptotic accuracy of Barron density estimates. IEEE Transactions on Information Theory, 44(3), 999-1009.
[5] Bhattacharyya, A. (1946). Sankhya, 8, 1.
[6] Blahut, R. E. (1987). Principles and Practice of Information Theory.
[7] Buzo, A., Gray, A. H., Gray, R. M., Markel, J. D. (1980). Speech coding based upon vector quantization. IEEE Transactions on Acoustics, Speech, and Signal Processing, 28(5), 562-574.
[8] Chernoff, H. (1952). A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. Annals of Mathematical Statistics, 23(4), 493-507.
[9] Clarke, B. S., Barron, A. R. (1990). Information-theoretic asymptotics of Bayes methods. IEEE Transactions on Information Theory, 36(3), 453-471.
[10] Cover, T. M. (2006). Elements of Information Theory.