Improved Detection of Correlated Signals in Low-Rank-Plus-Noise Type Data Sets Using Informative Canonical Correlation Analysis (ICCA)

Cited by: 13
Authors
Asendorf, Nicholas [1 ]
Nadakuditi, Raj Rao [1 ]
Institutions
[1] Univ Michigan, Dept Elect Engn & Comp Sci, Ann Arbor, MI 48109 USA
Keywords
Canonical correlation analysis; detection algorithms; random matrix theory; ASYMPTOTIC EXPANSIONS; MATRIX; RECOVERY; DISTRIBUTIONS; EIGENVALUE; ALGORITHM;
DOI
10.1109/TIT.2017.2695601
CLC classification
TP [automation technology; computer technology]
Subject classification
0812
Abstract
We consider two matrix-valued data sets modeled as low-rank correlated signal plus Gaussian noise. When empirical canonical correlation analysis (CCA) is used to infer these latent correlations, there is a broad regime where this inference fails, characterized by Bao and collaborators in the limit of high dimensionality and sample size. This regime includes the setting, previously considered by Pezeshki and collaborators, where the sample size is less than the combined dimensionality of the data sets. We revisit this detection problem by first observing that the empirically estimated canonical correlation coefficients are the singular values of the matrix of inner products between the right singular vectors of the two data sets. Motivated by insights from random matrix theory, we propose an algorithm, which we call informative CCA (ICCA), that infers the presence of latent correlations by considering the singular values of only the informative right singular vectors of each data set. We establish fundamental detection limits for ICCA and show that it dramatically outperforms empirical CCA in broad regimes where empirical CCA provably fails. We extend our theoretical analysis to settings where the data sets have randomly missing entries and to more general noise models. Finally, we validate our theoretical results with numerical simulations and a real-world experiment.
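The key observation in the abstract — that the empirical canonical correlations are the singular values of the inner products of the two data sets' right singular vectors — can be sketched in a few lines of NumPy. The sketch below assumes data matrices of shape variables × samples with at least as many samples as variables, so that the sample covariances are invertible; the function names and the oracle ranks `kx`, `ky` are illustrative only (the paper selects the informative components via a random-matrix-theory threshold, not an oracle).

```python
import numpy as np

def empirical_cca(X, Y):
    """Empirical canonical correlations of X (p x n) and Y (q x n):
    the singular values of Vx^T Vy, where Vx, Vy hold the right
    singular vectors of each data set."""
    _, _, Vxt = np.linalg.svd(X, full_matrices=False)  # Vxt: (p, n)
    _, _, Vyt = np.linalg.svd(Y, full_matrices=False)  # Vyt: (q, n)
    return np.linalg.svd(Vxt @ Vyt.T, compute_uv=False)

def icca(X, Y, kx, ky):
    """ICCA-style sketch: keep only the kx (resp. ky) leading,
    'informative' right singular vectors of each data set before
    forming the inner-product matrix. Here kx, ky are supplied by
    hand; the paper estimates them from the singular-value spectrum."""
    _, _, Vxt = np.linalg.svd(X, full_matrices=False)
    _, _, Vyt = np.linalg.svd(Y, full_matrices=False)
    return np.linalg.svd(Vxt[:kx] @ Vyt[:ky].T, compute_uv=False)
```

For rank-one correlated signals buried in noise, `icca(X, Y, 1, 1)` returns a single number, the alignment of the two leading empirical signal directions, which is what the detection test thresholds.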
Pages: 3451-3467
Page count: 17
Related papers (47 in total)
  • [1] [Anonymous], 2011, Advances in Neural Information Processing Systems
  • [2] Arbabshirani M. R., 2010, P 4 INT S COMM CONTR, P1
  • [3] Kernel independent component analysis
    Bach, FR
    Jordan, MI
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2003, 3 (01) : 1 - 48
  • [4] BAO Z., 2014, CANONICAL CORRELATIO
  • [5] Laplacian eigenmaps for dimensionality reduction and data representation
    Belkin, M
    Niyogi, P
    [J]. NEURAL COMPUTATION, 2003, 15 (06) : 1373 - 1396
  • [6] The singular values and vectors of low rank perturbations of large rectangular random matrices
    Benaych-Georges, Florent
    Nadakuditi, Raj Rao
    [J]. JOURNAL OF MULTIVARIATE ANALYSIS, 2012, 111 : 120 - 135
  • [7] The eigenvalues and eigenvectors of finite, low rank perturbations of large random matrices
    Benaych-Georges, Florent
    Nadakuditi, Raj Rao
    [J]. ADVANCES IN MATHEMATICS, 2011, 227 (01) : 494 - 521
  • [8] A SINGULAR VALUE THRESHOLDING ALGORITHM FOR MATRIX COMPLETION
    Cai, Jian-Feng
    Candes, Emmanuel J.
    Shen, Zuowei
    [J]. SIAM JOURNAL ON OPTIMIZATION, 2010, 20 (04) : 1956 - 1982
  • [9] Stable signal recovery from incomplete and inaccurate measurements
    Candes, Emmanuel J.
    Romberg, Justin K.
    Tao, Terence
    [J]. COMMUNICATIONS ON PURE AND APPLIED MATHEMATICS, 2006, 59 (08) : 1207 - 1223
  • [10] Robust Principal Component Analysis?
    Candes, Emmanuel J.
    Li, Xiaodong
    Ma, Yi
    Wright, John
    [J]. JOURNAL OF THE ACM, 2011, 58 (03)