Always good Turing: Asymptotically optimal probability estimation

Cited by: 80
Authors
Orlitsky, A [1]
Santhanam, NP
Zhang, JN
Affiliations
[1] Univ Calif San Diego, Dept Elect & Comp Engn, La Jolla, CA 92093 USA
[2] Univ Calif San Diego, Dept Comp Sci & Engn, La Jolla, CA 92093 USA
DOI
10.1126/science.1088284
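Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract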
While deciphering the Enigma code, Good and Turing derived an unintuitive, yet effective, formula for estimating a probability distribution from a sample of data. We define the attenuation of a probability estimator as the largest possible ratio between the per-symbol probability assigned to an arbitrarily long sequence by any distribution, and the corresponding probability assigned by the estimator. We show that some common estimators have infinite attenuation and that the attenuation of the Good-Turing estimator is low, yet greater than 1. We then derive an estimator whose attenuation is 1; that is, asymptotically it does not underestimate the probability of any sequence.
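For orientation, the classical Good-Turing formula mentioned in the abstract can be sketched as follows. This is a minimal illustration of the textbook, unsmoothed form only, not the attenuation-1 estimator derived in the paper; the function name `good_turing` and the example string are assumptions made for this sketch. A symbol seen r times in a sample of size n is assigned probability (r + 1) N_{r+1} / (N_r n), where N_r is the number of distinct symbols seen exactly r times, and the total probability of all unseen symbols is estimated as N_1 / n.

```python
from collections import Counter


def good_turing(sample):
    """Unsmoothed Good-Turing estimates for a sequence of symbols.

    Returns (estimates, missing_mass): estimates maps each observed symbol
    to its estimated probability, and missing_mass is the estimated total
    probability of all symbols that never appeared in the sample.
    """
    n = len(sample)
    counts = Counter(sample)               # r: occurrences of each observed symbol
    prevalence = Counter(counts.values())  # N_r: number of symbols seen exactly r times

    estimates = {}
    for symbol, r in counts.items():
        # (r + 1) * N_{r+1} / (N_r * n); zero whenever no symbol was seen r + 1 times.
        estimates[symbol] = (r + 1) * prevalence.get(r + 1, 0) / (prevalence[r] * n)

    missing_mass = prevalence.get(1, 0) / n  # N_1 / n
    return estimates, missing_mass


# Hypothetical example: counts are a=5, b=2, r=2, c=1, d=1, so N_1 = 2, N_2 = 2, N_5 = 1.
estimates, missing_mass = good_turing("abracadabra")
```

In this small example the unsmoothed formula assigns probability zero to "a", "b", and "r", because no symbol occurs exactly 6 or 3 times. Practical variants smooth the N_r values to avoid this.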
Pages: 427-431
Number of pages: 5