SOFAR: Large-Scale Association Network Learning

Cited by: 0
Authors
Uematsu, Yoshimasa [1, 2]
Fan, Yingying [3]
Chen, Kun [4]
Lv, Jinchi [3]
Lin, Wei [5, 6]
Affiliations
[1] USC Marshall, Los Angeles, CA 90089 USA
[2] Tohoku Univ, Dept Econ & Management, Sendai, Miyagi 9808576, Japan
[3] Univ Southern Calif, Marshall Sch Business, Data Sci & Operat Dept, Los Angeles, CA 90089 USA
[4] Univ Connecticut, Dept Stat, Storrs, CT 06269 USA
[5] Peking Univ, Sch Math Sci, Beijing 100871, Peoples R China
[6] Peking Univ, Ctr Stat Sci, Beijing 100871, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
Big data; large-scale association network; simultaneous response and predictor selection; latent factors; sparse singular value decomposition; orthogonality constrained optimization; nonconvex statistical learning; REDUCED-RANK REGRESSION; NONCONCAVE PENALIZED LIKELIHOOD; PRINCIPAL COMPONENT ANALYSIS; FACTOR MODELS; DIMENSION REDUCTION; VARIABLE SELECTION; MATRIX ESTIMATION; SPARSE PCA; ESTIMATORS; ALGORITHMS;
DOI
10.1109/TIT.2019.2909889
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
Many modern big data applications feature large scale in both numbers of responses and predictors. Better statistical efficiency and scientific insights can be enabled by understanding the large-scale response-predictor association network structures via layers of sparse latent factors ranked by importance. Yet sparsity and orthogonality have been two largely incompatible goals. To accommodate both features, in this paper, we suggest the method of sparse orthogonal factor regression (SOFAR) via the sparse singular value decomposition with orthogonality constrained optimization to learn the underlying association networks, with broad applications to both unsupervised and supervised learning tasks, such as biclustering with sparse singular value decomposition, sparse principal component analysis, sparse factor analysis, and sparse vector autoregression analysis. Exploiting the framework of convexity-assisted nonconvex optimization, we derive nonasymptotic error bounds for the suggested procedure characterizing the theoretical advantages. The statistical guarantees are powered by an efficient SOFAR algorithm with convergence property. Both computational and theoretical advantages of our procedure are demonstrated with several simulations and real data examples.
Pages: 4924-4939
Number of pages: 16