Maximum Likelihood Estimation for Learning Populations of Parameters

Cited by: 0
Authors
Vinayak, Ramya Korlakai [1 ]
Kong, Weihao [2 ]
Valiant, Gregory [2 ]
Kakade, Sham [1 ]
Affiliations
[1] Univ Washington, Paul G Allen Sch Comp Sci & Engn, Seattle, WA 98195 USA
[2] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA
Funding
National Science Foundation (USA);
Keywords
ENVIRONMENTAL HETEROGENEITY; ENTROPY; UNSEEN; NUMBER; DISTRIBUTIONS; MIXTURES; SAMPLE;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Consider a setting with N independent individuals, each with an unknown parameter p_i ∈ [0, 1] drawn from some unknown distribution P*. After observing the outcomes of t independent Bernoulli trials per individual, i.e., X_i ~ Binomial(t, p_i), our objective is to accurately estimate P*. This problem arises in numerous domains, including the social sciences, psychology, healthcare, and biology, where the size of the population under study is usually large while the number of observations per individual is often limited. Our main result shows that, in the regime where t ≪ N, the maximum likelihood estimator (MLE) is both statistically minimax optimal and efficiently computable. Precisely, for sufficiently large N, the MLE achieves the information-theoretically optimal error bound of O(1/t) for t < c log N, with respect to the earth mover's distance between the estimated and true distributions. More generally, in an exponentially large interval of t beyond c log N, the MLE achieves the minimax error bound of O(1/√(t log N)). In contrast, regardless of how large N is, the naive "plug-in" estimator only achieves the sub-optimal error of Θ(1/√t).
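A minimal simulation sketch of the problem setting and the naive plug-in baseline described above. The choice of Beta(2, 5) as the true population distribution P* is an illustrative assumption, not taken from the paper; the one-dimensional earth mover's distance between two equal-size empirical samples is computed as the mean absolute difference of their sorted values.

```python
import numpy as np

rng = np.random.default_rng(0)
N, t = 100_000, 10  # many individuals, few observations each (t << N)

# Assumed true population distribution P*: each individual's p_i ~ Beta(2, 5).
p = rng.beta(2.0, 5.0, size=N)

# Each individual contributes t Bernoulli trials; only the success count is seen.
X = rng.binomial(t, p)

# Naive "plug-in" estimator of each p_i: the empirical frequency X_i / t.
p_hat = X / t


def emd(a, b):
    """1-D earth mover's distance between equal-size empirical distributions:
    the mean absolute difference of the sorted samples."""
    return float(np.mean(np.abs(np.sort(a) - np.sort(b))))


# Error of the plug-in estimate of P*; per the paper this scales like Theta(1/sqrt(t)).
err = emd(p_hat, p)
```

The MLE itself (maximizing the likelihood of the observed counts over all distributions on [0, 1]) is what the paper analyzes; this sketch only sets up the data model and the plug-in baseline it is compared against.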
Pages: 10