SHARP VARIABLE SELECTION OF A SPARSE SUBMATRIX IN A HIGH-DIMENSIONAL NOISY MATRIX

Cited by: 14
Authors
Butucea, Cristina [1, 2]
Ingster, Yuri I.
Suslina, Irina A. [3 ]
Affiliations
[1] Univ Paris Est, CNRS, UPEMLV, LAMA,UMR 8050,UPEC, F-77454 Marne La Vallee, France
[2] CREST, F-92240 Malakoff, France
[3] St Petersburg Natl Res Univ Informat Technol Mech, St Petersburg 197101, Russia
Keywords
Estimation; minimax testing; large matrices; selection of sparse signal; sharp selection bounds; variable selection; LARGE-AVERAGE;
DOI
10.1051/ps/2014017
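Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification code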
020208 ; 070103 ; 0714 ;
Abstract
We observe an N x M matrix of independent, identically distributed Gaussian random variables which are centered except for elements of some submatrix of size n x m where the mean is larger than some a > 0. The submatrix is sparse in the sense that n/N and m/M tend to 0, whereas n, m, N and M tend to infinity. We consider the problem of selecting the random variables with significantly large mean values, as was also considered by [M. Kolar, S. Balakrishnan, A. Rinaldo and A. Singh, NIPS (2011)]. We give sufficient conditions on a as a function of n, m, N and M and construct a uniformly consistent procedure in order to do sharp variable selection. We also prove the minimax lower bounds under necessary conditions which are complementary to the previous conditions. The critical values a* separating the necessary and sufficient conditions are sharp (we show exact constants), whereas [M. Kolar, S. Balakrishnan, A. Rinaldo and A. Singh, NIPS (2011)] only prove rate optimality and focus on suboptimal computationally feasible selectors. Note that rate optimality in this problem leaves out a large set of possible parameters, where we do not know whether consistent selection is possible.
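The observation model in the abstract can be illustrated with a small simulation. The sketch below is not the authors' selection procedure: it assumes unit-variance noise, sets every mean in the submatrix exactly equal to a (the abstract only requires it to be larger than a), and uses a naive selector that keeps the n rows and m columns with the largest sums. The function names `simulate` and `select_by_sums` are hypothetical, and the chosen value of a is far above any critical threshold so that recovery is easy.

```python
import numpy as np


def simulate(N, M, n, m, a, rng):
    """Draw an N x M standard Gaussian matrix and add mean a on a hidden n x m submatrix.

    Assumptions (not from the source): unit variance, mean exactly a on the submatrix.
    """
    rows = np.sort(rng.choice(N, size=n, replace=False))
    cols = np.sort(rng.choice(M, size=m, replace=False))
    Y = rng.standard_normal((N, M))
    Y[np.ix_(rows, cols)] += a  # shift only the entries of the hidden submatrix
    return Y, rows, cols


def select_by_sums(Y, n, m):
    """Naive selector (illustrative only): keep the n rows and m columns with the largest sums."""
    row_hat = np.sort(np.argsort(Y.sum(axis=1))[-n:])
    col_hat = np.sort(np.argsort(Y.sum(axis=0))[-m:])
    return row_hat, col_hat


rng = np.random.default_rng(0)
N, M, n, m, a = 100, 100, 10, 10, 10.0  # strongly separated regime, chosen for illustration
Y, rows, cols = simulate(N, M, n, m, a, rng)
row_hat, col_hat = select_by_sums(Y, n, m)
print("rows recovered:", np.array_equal(row_hat, rows))
print("columns recovered:", np.array_equal(col_hat, cols))
```

Lowering a toward smaller values makes the row and column sums of the submatrix overlap with those of pure-noise rows and columns, which is the regime where the sharp constants discussed in the abstract matter.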
Pages: 115-134
Page count: 20
Related papers
32 in total
  • [1] Adapting to unknown sparsity by controlling the false discovery rate
    Abramovich, Felix
    Benjamini, Yoav
    Donoho, David L.
    Johnstone, Iain M.
    [J]. ANNALS OF STATISTICS, 2006, 34 (02) : 584 - 653
  • [2] [Anonymous], 2003, LECT NOTES STAT
  • [3] Near-optimal detection of geometric objects by fast multiscale methods
    Arias-Castro, E
    Donoho, DL
    Huo, XM
    [J]. IEEE TRANSACTIONS ON INFORMATION THEORY, 2005, 51 (07) : 2402 - 2425
  • [4] ARIAS-CASTRO E., 2010, ARXIV10071434
  • [5] Arias-Castro E., 2012, ARXIV12082635
  • [6] DETECTION OF AN ANOMALOUS CLUSTER IN A NETWORK
    Arias-Castro, Ery
    Candes, Emmanuel J.
    Durand, Arnaud
    [J]. ANNALS OF STATISTICS, 2011, 39 (01) : 278 - 304
  • [7] CONTROLLING THE FALSE DISCOVERY RATE - A PRACTICAL AND POWERFUL APPROACH TO MULTIPLE TESTING
    BENJAMINI, Y
    HOCHBERG, Y
    [J]. JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 1995, 57 (01) : 289 - 300
  • [8] Selection of variables and dimension reduction in high-dimensional non-parametric regression
    Bertin, Karine
    Lecue, Guillaume
    [J]. ELECTRONIC JOURNAL OF STATISTICS, 2008, 2 : 1224 - 1241
  • [9] SIMULTANEOUS ANALYSIS OF LASSO AND DANTZIG SELECTOR
    Bickel, Peter J.
    Ritov, Ya'acov
    Tsybakov, Alexandre B.
    [J]. ANNALS OF STATISTICS, 2009, 37 (04) : 1705 - 1732
  • [10] Butucea C., 2013, ARXIV13014660