Asymptotic minimax regret for data compression, gambling, and prediction

Cited: 117
Authors
Xie, Q [1]
Barron, AR [2]
Affiliations
[1] GE Capital, Stamford, CT 06927 USA
[2] Yale Univ, Dept Stat, New Haven, CT 06520 USA
Funding
U.S. National Science Foundation
Keywords
Jeffreys' prior; minimax redundancy; minimax regret; universal coding; universal prediction
DOI
10.1109/18.825803
Chinese Library Classification
TP [automation technology; computer technology]
Discipline code
0812
Abstract
摘要
For problems of data compression, gambling, and prediction of individual sequences $x_1,\ldots,x_n$ the following questions arise. Given a target family of probability mass functions $p(x_1,\ldots,x_n \mid \theta)$, how do we choose a probability mass function $q(x_1,\ldots,x_n)$ so that it approximately minimizes the maximum regret $\max_{x_1,\ldots,x_n}\bigl(\log 1/q(x_1,\ldots,x_n) - \log 1/p(x_1,\ldots,x_n \mid \hat{\theta})\bigr)$ and so that it achieves the best constant $C$ in the asymptotics of the minimax regret, which is of the form $(d/2)\log(n/2\pi) + C + o(1)$, where $d$ is the parameter dimension? Are there easily implementable strategies $q$ that achieve those asymptotics? And how does the solution to the worst case sequence problem relate to the solution of the corresponding expectation version $\min_q \max_\theta E_\theta\bigl(\log 1/q(x_1,\ldots,x_n) - \log 1/p(x_1,\ldots,x_n \mid \theta)\bigr)$? In the discrete memoryless case, with a given alphabet of size $m$, the Bayes procedure with the Dirichlet$(1/2,\ldots,1/2)$ prior is asymptotically maximin. Simple modifications of it are shown to be asymptotically minimax. The best constant is $C_m = \log\bigl(\Gamma(1/2)^m / \Gamma(m/2)\bigr)$, which agrees with the logarithm of the integral of the square root of the determinant of the Fisher information. Moreover, our asymptotically optimal strategies for the worst case problem are also asymptotically optimal for the expectation version. Analogous conclusions are given for the case of prediction, gambling, and compression when, for each observation, one has access to side information from an alphabet of size $k$. In this setting the minimax regret is shown to be $\bigl(k(m-1)/2\bigr)\log(n/2\pi k) + kC_m + o(1)$.
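A numerical sketch (mine, not from the paper) of the binary case $m = 2$, $d = m - 1 = 1$: the Bayes mixture under the Dirichlet$(1/2,1/2)$ prior is the Krichevsky–Trofimov estimator, whose probability of a sequence with counts $(n_0, n_1)$ has the closed form $\Gamma(n_0+1/2)\Gamma(n_1+1/2)/(\pi\,\Gamma(n+1))$. The code below compares its worst-case regret over binary sequences of length $n$ with the asymptotic minimax value $(1/2)\log(n/2\pi) + C_2$, where $C_2 = \log(\Gamma(1/2)^2/\Gamma(1)) = \log\pi$. Function names are my own.

```python
import math

def log_kt_prob(n0, n1):
    # Log of the Dirichlet(1/2,1/2) (Krichevsky-Trofimov) mixture probability
    # of a binary sequence with n0 zeros and n1 ones:
    #   q(x^n) = Gamma(n0 + 1/2) * Gamma(n1 + 1/2) / (pi * Gamma(n + 1))
    return (math.lgamma(n0 + 0.5) + math.lgamma(n1 + 0.5)
            - math.lgamma(n0 + n1 + 1) - math.log(math.pi))

def log_ml_prob(n0, n1):
    # Log of the maximum-likelihood probability p(x^n | theta_hat),
    # with theta_hat = n1 / n.
    n = n0 + n1
    out = 0.0
    if n0:
        out += n0 * math.log(n0 / n)
    if n1:
        out += n1 * math.log(n1 / n)
    return out

def worst_case_regret(n):
    # The regret depends on the sequence only through the counts,
    # so the max over all 2^n sequences reduces to a max over n1.
    return max(log_ml_prob(n - k, k) - log_kt_prob(n - k, k)
               for k in range(n + 1))

n = 1000
# Asymptotic minimax regret for m = 2: (d/2) log(n / 2 pi) + C_2, C_2 = log(pi)
asym = 0.5 * math.log(n / (2 * math.pi)) + math.log(math.pi)
print(f"KT worst-case regret (n={n}): {worst_case_regret(n):.4f} nats")
print(f"asymptotic minimax regret:    {asym:.4f} nats")
```

The worst case occurs at the degenerate all-zeros/all-ones sequences, where the unmodified mixture exceeds the minimax level by roughly $(1/2)\log 2$ nats, consistent with the abstract's remark that simple modifications of the Dirichlet$(1/2,\ldots,1/2)$ procedure are needed to attain the asymptotic minimax constant.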
Pages: 431-445
Page count: 15