α-Divergence Is Unique, Belonging to Both f-Divergence and Bregman Divergence Classes

Cited by: 106
Author
Amari, Shun-Ichi [1]
Affiliation
[1] Riken Brain Sci Inst, Wako, Saitama 3510198, Japan
Keywords
Bregman divergence; canonical divergence; dually flat structure; f-divergence; Fisher information; information geometry; information monotonicity; INFORMATION;
DOI
10.1109/TIT.2009.2030485
Chinese Library Classification number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
A divergence measure between two probability distributions or positive arrays (positive measures) is a useful tool for solving optimization problems in optimization, signal processing, machine learning, and statistical inference. The Csiszar divergence is a unique class of divergences having information monotonicity, from which the dual alpha geometrical structure with the Fisher metric is derived. The Bregman divergence is another class of divergences that gives a dually flat geometrical structure different from the alpha-structure in general. Csiszar gave an axiomatic characterization of divergences related to inference problems. The Kullback-Leibler divergence is proved to belong to both classes, and this is the only such one in the space of probability distributions. This paper proves that the alpha-divergences constitute a unique class belonging to both classes when the space of positive measures or positive arrays is considered. They are the canonical divergences derived from the dually flat geometrical structure of the space of positive measures.
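The abstract's central claim, that an alpha-divergence on positive arrays can be written both as an f-divergence and as a Bregman divergence, can be checked numerically. The following is a minimal sketch and is not taken from the paper: the function names, the shorthand b = (1 - alpha) / 2, the argument order D[p : q], and the choices of generator f, coordinates theta = p^b, and potential psi are assumptions made for illustration, and the Bregman form as written here is only intended for -1 < alpha < 1.

```python
# Illustrative sketch (assumed parametrization, not the paper's notation).
# b = (1 - alpha) / 2, valid here for -1 < alpha < 1, i.e. 0 < b < 1.

def alpha_divergence(p, q, alpha):
    """Direct formula on positive arrays:
    D = 1/(b(1-b)) * sum(b*p + (1-b)*q - p^b * q^(1-b))."""
    b = (1.0 - alpha) / 2.0
    total = sum(b * pi + (1 - b) * qi - pi**b * qi**(1 - b)
                for pi, qi in zip(p, q))
    return total / (b * (1 - b))

def as_f_divergence(p, q, alpha):
    """Same quantity written as sum_i p_i * f(q_i / p_i)
    with the convex generator f(u) = (b + (1-b)*u - u^(1-b)) / (b(1-b))."""
    b = (1.0 - alpha) / 2.0

    def f(u):
        return (b + (1 - b) * u - u**(1 - b)) / (b * (1 - b))

    return sum(pi * f(qi / pi) for pi, qi in zip(p, q))

def as_bregman_divergence(p, q, alpha):
    """Same quantity written as
    psi(theta_p) - psi(theta_q) - <grad psi(theta_q), theta_p - theta_q>
    with coordinates theta_i = p_i^b and psi(theta) = sum(theta_i^(1/b)) / (1-b)."""
    b = (1.0 - alpha) / 2.0
    theta_p = [pi**b for pi in p]
    theta_q = [qi**b for qi in q]

    def psi(theta):
        return sum(t**(1.0 / b) for t in theta) / (1 - b)

    # Gradient of psi evaluated at theta_q, component by component.
    grad_q = [t**(1.0 / b - 1.0) / (b * (1 - b)) for t in theta_q]
    inner = sum(g * (tp - tq) for g, tp, tq in zip(grad_q, theta_p, theta_q))
    return psi(theta_p) - psi(theta_q) - inner

# Two positive arrays that are deliberately not normalized to sum to 1.
p = [0.5, 1.5, 2.0]
q = [1.0, 1.0, 0.7]
alpha = 0.3

d_direct = alpha_divergence(p, q, alpha)
d_f = as_f_divergence(p, q, alpha)
d_bregman = as_bregman_divergence(p, q, alpha)
```

Under these assumptions the three values agree up to floating-point rounding, the divergence is positive for p different from q, and it vanishes when the two arrays coincide. The arrays are left unnormalized on purpose, since the abstract states the result for the space of positive measures rather than for probability distributions.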
Pages: 4925-4931
Number of pages: 7