Optimal detection of heterogeneous and heteroscedastic mixtures

Cited by: 76
Authors
Cai, T. Tony
Jeng, X. Jessie [1]
Jin, Jiashun [2]
Affiliations
[1] Univ Penn, Dept Biostat & Epidemiol, Philadelphia, PA 19104 USA
[2] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
Funding
US National Science Foundation
Keywords
Detection boundary; Higher criticism; Likelihood ratio test; Optimal adaptivity; Sparsity; HIGH-DIMENSIONAL DATA; HIGHER CRITICISM; FEATURE-SELECTION; ORACLE;
DOI
10.1111/j.1467-9868.2011.00778.x
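Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract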
The problem of detecting heterogeneous and heteroscedastic Gaussian mixtures is considered. The focus is on how the parameters of heterogeneity, heteroscedasticity and proportion of non-null component influence the difficulty of the problem. We establish an explicit detection boundary which separates the detectable region where the likelihood ratio test is shown to detect the presence of non-null effects reliably from the undetectable region where no method can do so. In particular, the results show that the detection boundary changes dramatically when the proportion of non-null component shifts from the sparse regime to the dense regime. Furthermore, it is shown that the higher criticism test, which does not require specific information on model parameters, is optimally adaptive to the unknown degrees of heterogeneity and heteroscedasticity in both the sparse and the dense cases.
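To make the higher criticism test mentioned in the abstract more concrete, here is a minimal Python sketch. It assumes a commonly used form of the statistic from Donoho and Jin (2004), computed from sorted one-sided p-values under a standard normal null; the exact variant analysed in this paper may differ. The function names, the `alpha0` truncation fraction, and the mixture parameters `eps`, `mu` and `sigma` are illustrative assumptions, not taken from this record.

```python
import math
import random


def higher_criticism(x, alpha0=0.5):
    """Higher criticism statistic for observations x under an N(0, 1) null.

    Assumed form: max over the smallest alpha0 * n p-values of
    sqrt(n) * (i/n - p_(i)) / sqrt(p_(i) * (1 - p_(i))).
    """
    n = len(x)
    # One-sided p-values P(Z > x_i) for Z ~ N(0, 1), sorted ascending.
    pvals = sorted(0.5 * math.erfc(v / math.sqrt(2.0)) for v in x)
    k = max(1, int(alpha0 * n))
    best = -math.inf
    for i in range(1, k + 1):
        p = pvals[i - 1]
        if p <= 0.0 or p >= 1.0:
            # Skip degenerate p-values (numerical underflow) to avoid dividing by zero.
            continue
        hc_i = math.sqrt(n) * (i / n - p) / math.sqrt(p * (1.0 - p))
        best = max(best, hc_i)
    return best


def sample_mixture(n, eps, mu, sigma, rng):
    """Draw n points from (1 - eps) * N(0, 1) + eps * N(mu, sigma^2).

    eps, mu and sigma are hypothetical names for the proportion,
    heterogeneity and heteroscedasticity parameters.
    """
    return [
        rng.gauss(mu, sigma) if rng.random() < eps else rng.gauss(0.0, 1.0)
        for _ in range(n)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    null_sample = sample_mixture(10_000, 0.0, 0.0, 1.0, rng)
    mixed_sample = sample_mixture(10_000, 0.01, 3.0, 1.5, rng)
    print("HC under the null:   ", higher_criticism(null_sample))
    print("HC under the mixture:", higher_criticism(mixed_sample))
```

Large values of the statistic are evidence against the pure-null model. The statistic itself uses none of the mixture parameters, which is the sense in which the abstract calls the test adaptive. A rejection threshold would in practice be calibrated, for example by simulating under the null.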
Pages: 629-662
Page count: 34