Robust Parallel Pursuit for Large-Scale Association Network Learning

Times cited: 0
Authors
Li, Wenhui [1]
Zhou, Xin [1]
Dong, Ruipeng [1]
Zheng, Zemin [1]
Affiliations
[1] Univ Sci & Technol China, Sch Management, Int Inst Finance, Hefei 230026, Anhui, Peoples R China
Funding
National Key Research and Development Program of China; China Postdoctoral Science Foundation
Keywords
large-scale association network; outlier detection; robust estimation; sparse reduced-rank regression; scalability; parallel pursuit; ISOPRENOID BIOSYNTHESIS; REGRESSION; SELECTION; COMPLEXITY; MODELS; LASSO;
DOI
10.1287/ijoc.2022.0181
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Sparse reduced-rank regression is an important tool for uncovering large-scale response-predictor association networks, as exemplified by modern applications such as diffusion networks and recommendation systems. However, the association networks recovered by existing methods are either sensitive to outliers or not scalable under the big data setup. In this paper, we propose a new statistical learning method called robust parallel pursuit (ROP) for joint estimation and outlier detection in large-scale response-predictor association network analysis. The proposed method is scalable in that it transforms the original large-scale network learning problem into a set of sparse unit-rank estimations via factor analysis, thus facilitating an effective parallel pursuit algorithm. Furthermore, we provide comprehensive theoretical guarantees, including consistency in parameter estimation, rank selection, and outlier detection, and we conduct an inference procedure to quantify the uncertainty about the existence of outliers. Extensive simulation studies and two real-data analyses demonstrate the effectiveness and scalability of the proposed approach.
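The abstract's idea of reducing one low-rank problem to a set of unit-rank pieces can be illustrated with a minimal sketch. This is not the authors' ROP algorithm: it only shows, under assumed toy dimensions, that a rank-r coefficient matrix splits exactly into r unit-rank layers via the singular value decomposition, which is what makes layer-by-layer (and hence parallel) estimation conceivable. The function name `unit_rank_layers` and the sizes `p`, `q`, `r` are hypothetical, and sparsity, outlier handling, and the factor-analysis step are all omitted.

```python
import numpy as np


def unit_rank_layers(C, rank):
    """Split C into `rank` unit-rank layers d_k * u_k v_k^T using the SVD.

    Illustrative only: a plain SVD, with no sparsity or robustness.
    """
    U, d, Vt = np.linalg.svd(C, full_matrices=False)
    return [d[k] * np.outer(U[:, k], Vt[k, :]) for k in range(rank)]


# Hypothetical toy setup: p predictors, q responses, true rank r.
rng = np.random.default_rng(0)
p, q, r = 8, 5, 2
C = rng.normal(size=(p, r)) @ rng.normal(size=(r, q))  # rank-r coefficient matrix

layers = unit_rank_layers(C, r)

# Each layer has rank one, and the layers add back up to C.
layer_ranks = [int(np.linalg.matrix_rank(layer)) for layer in layers]
reconstruction_ok = bool(np.allclose(sum(layers), C))
```

Because each layer depends only on its own singular triple, the layers could in principle be handled by separate workers; how ROP actually estimates them is described in the paper itself.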
Pages: 428-445
Number of pages: 19
Related papers
44 in total
[1]  
[Anonymous], 2011, Robust statistics
[2]  
Bahadori M.T., 2013, P 2013 SIAM INT C DA, P467, DOI 10.1137/1.9781611972832.52
[3]   Inferential theory for factor models of large dimensions. [J].
Bai, J .
ECONOMETRICA, 2003, 71 (01) :135-171
[4]   SIMULTANEOUS ANALYSIS OF LASSO AND DANTZIG SELECTOR [J].
Bickel, Peter J. ;
Ritov, Ya'acov ;
Tsybakov, Alexandre B. .
ANNALS OF STATISTICS, 2009, 37 (04) :1705-1732
[5]   The landscape of genetic complexity across 5,700 gene expression traits in yeast [J].
Brem, RB ;
Kruglyak, L .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2005, 102 (05) :1572-1577
[6]  
Bruna J., 2014, INT C LEARNING REPRE
[7]   JOINT VARIABLE AND RANK SELECTION FOR PARSIMONIOUS ESTIMATION OF HIGH-DIMENSIONAL MATRICES [J].
Bunea, Florentina ;
She, Yiyuan ;
Wegkamp, Marten H. .
ANNALS OF STATISTICS, 2012, 40 (05) :2359-2388
[8]  
Candes E, 2007, ANN STAT, V35, P2313, DOI 10.1214/009053606000001523
[9]  
Carroll R., 2006, MEASUREMENT ERROR NO
[10]   Reduced rank stochastic regression with a sparse singular value decomposition [J].
Chen, Kun ;
Chan, Kung-Sik ;
Stenseth, Nils Chr. .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2012, 74 :203-221