Detection of a sparse submatrix of a high-dimensional noisy matrix

Cited by: 74
Authors
Butucea, Cristina [1, 2]
Ingster, Yuri I. [3]
Affiliations
[1] Univ Paris Est, UPEMLV, CNRS, LAMA,UPEC,UMR 8050, F-77464 Marne La Vallee, France
[2] CREST, F-92240 Malakoff, France
[3] St Petersburg Electrotech Univ, St Petersburg 197376, Russia
Keywords
detection of sparse signal; minimax adaptive testing; minimax testing; random matrices; sharp detection bounds;
DOI
10.3150/12-BEJ470
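Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification code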
020208 ; 070103 ; 0714 ;
Abstract
We observe an $N \times M$ matrix $Y_{ij} = s_{ij} + \xi_{ij}$, where the $\xi_{ij} \sim \mathcal{N}(0, 1)$ are i.i.d. in $i, j$ and $s_{ij} \in \mathbb{R}$. We test the null hypothesis $s_{ij} = 0$ for all $i, j$ against the alternative that there exists some submatrix of size $n \times m$ with significant elements in the sense that $s_{ij} \ge a > 0$. We propose a test procedure and compute the asymptotic detection boundary $a$ so that the maximal testing risk tends to 0 as $M \to \infty$, $N \to \infty$, $p = n/N \to 0$, $q = m/M \to 0$. We prove that this boundary is asymptotically sharp minimax under some additional constraints. Relations with other testing problems are discussed. We propose a testing procedure which adapts to unknown $(n, m)$ within some given set and compute the adaptive sharp rates. The implementation of our test procedure on synthetic data shows excellent behavior for sparse, not necessarily square, matrices. We extend our sharp minimax results in different directions: first, to Gaussian matrices with unknown variance; next, to matrices of random variables having a distribution from an exponential family (non-Gaussian); and, finally, to a two-sided alternative for matrices with Gaussian elements.
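The observation model in the abstract can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the function names `simulate` and `linear_statistic` are hypothetical, the planted submatrix is given the constant value `a` (the alternative only requires entries of at least `a`), and the normalized total sum is one simple statistic that is exactly N(0, 1) under the null. It is not claimed to be the test procedure proposed in the paper.

```python
import numpy as np


def simulate(N, M, n, m, a, rng):
    """Draw Y = s + xi with xi_ij ~ N(0, 1) i.i.d.

    Assumption for illustration: s equals the constant `a` on a randomly
    chosen n x m submatrix and 0 elsewhere (a = 0 gives the null).
    """
    Y = rng.standard_normal((N, M))
    rows = rng.choice(N, size=n, replace=False)
    cols = rng.choice(M, size=m, replace=False)
    # np.ix_ builds the row/column cross product, i.e. the submatrix.
    Y[np.ix_(rows, cols)] += a
    return Y


def linear_statistic(Y):
    """Sum of all entries divided by sqrt(N * M); N(0, 1) under the null."""
    return Y.sum() / np.sqrt(Y.size)


rng = np.random.default_rng(0)
Y_null = simulate(200, 300, 20, 30, 0.0, rng)
Y_alt = simulate(200, 300, 20, 30, 1.0, rng)
print("null:", linear_statistic(Y_null))
print("alternative:", linear_statistic(Y_alt))
```

Under this constant-signal assumption the statistic is shifted by $a\,nm/\sqrt{NM}$ under the alternative, so it carries no information when the submatrix is very small relative to the matrix; statistics that scan over candidate submatrices are the usual complement in that regime.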
Pages: 2652-2688
Page count: 37
Related papers
11 in total
[1]   ON COMBINATORIAL TESTING PROBLEMS [J].
Addario-Berry, Louigi ;
Broutin, Nicolas ;
Devroye, Luc ;
Lugosi, Gabor .
ANNALS OF STATISTICS, 2010, 38 (05) :3063-3092
[2]   Near-optimal detection of geometric objects by fast multiscale methods [J].
Arias-Castro, E ;
Donoho, DL ;
Huo, XM .
IEEE TRANSACTIONS ON INFORMATION THEORY, 2005, 51 (07) :2402-2425
[3]  
Arias-Castro E., 2010, GLOBAL TESTING SPARS
[4]   DETECTION OF AN ANOMALOUS CLUSTER IN A NETWORK [J].
Arias-Castro, Ery ;
Candes, Emmanuel J. ;
Durand, Arnaud .
ANNALS OF STATISTICS, 2011, 39 (01) :278-304
[5]   SIMULTANEOUS ANALYSIS OF LASSO AND DANTZIG SELECTOR [J].
Bickel, Peter J. ;
Ritov, Ya'acov ;
Tsybakov, Alexandre B. .
ANNALS OF STATISTICS, 2009, 37 (04) :1705-1732
[6]   Higher criticism for detecting sparse heterogeneous mixtures [J].
Donoho, D ;
Jin, JS .
ANNALS OF STATISTICS, 2004, 32 (03) :962-994
[7]  
Ingster Y.I., 2002, J MATH SCI, V294, p[88, 1723]
[8]  
Ingster Y. I., 1997, MATH METHODS STAT, V6, P47, DOI 10.20347/WIAS.PREPRINT.215
[9]   NUCLEAR-NORM PENALIZATION AND OPTIMAL RATES FOR NOISY LOW-RANK MATRIX COMPLETION [J].
Koltchinskii, Vladimir ;
Lounici, Karim ;
Tsybakov, Alexandre B. .
ANNALS OF STATISTICS, 2011, 39 (05) :2302-2329
[10]   FINDING LARGE AVERAGE SUBMATRICES IN HIGH DIMENSIONAL DATA [J].
Shabalin, Andrey A. ;
Weigman, Victor J. ;
Perou, Charles M. ;
Nobel, Andrew B. .
ANNALS OF APPLIED STATISTICS, 2009, 3 (03) :985-1012