Factor Model-Based Large Covariance Estimation from Streaming Data Using a Knowledge-Based Sketch Matrix

Times cited: 0
Authors
Tan, Xiao [1]
Wang, Zhaoyang [1]
Qian, Hao [2]
Zhou, Jun [2]
Duan, Peibo [3]
Shen, Dian [1]
Wang, Meng [4]
Wang, Beilun [1]
Affiliations
[1] Southeast Univ, Nanjing, Peoples R China
[2] Ant Grp, Hangzhou, Peoples R China
[3] Monash Univ, Melbourne, Vic, Australia
[4] Tongji Univ, Shanghai, Peoples R China
Source
PROCEEDINGS OF THE 33RD ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2024 | 2024
Funding
National Natural Science Foundation of China;
Keywords
Covariance Matrix; Streaming Data; Sketching Algorithm; NUMBER;
DOI
10.1145/3627673.3679820
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Covariance matrix estimation is an important problem in statistics, with wide applications in finance, neuroscience, meteorology, oceanography, and other fields. However, when the data are high-dimensional and are continuously generated and updated in a streaming fashion, covariance matrix estimation faces major challenges, including the curse of dimensionality and limited memory. Existing methods either assume sparsity, ignoring any possible common factors among the variables, or perform poorly when recovering the covariance matrix directly from sketched data. To address these issues, we propose a novel method, KEEF (Knowledge-based Time and Memory Efficient Covariance Estimator in Factor Model), together with an extended variant. Our method leverages historical data to train a knowledge-based sketch matrix, which is used to accelerate the factor analysis of streaming data and to estimate the covariance matrix directly from the sketched data. We provide theoretical guarantees showing the advantages of our method in terms of time and space complexity as well as accuracy. We conduct extensive experiments on synthetic and real-world data, comparing KEEF with several state-of-the-art methods and demonstrating the superior performance of our method.
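To make the general idea in the abstract concrete, the following is a minimal illustrative sketch, not the authors' KEEF algorithm. It assumes zero-mean data, and it uses the top-k eigenvectors of the historical sample covariance as a stand-in for the "knowledge-based" sketch matrix; the record does not say how KEEF actually trains its sketch. The names `train_sketch` and `StreamingFactorCovariance` are hypothetical. The estimator keeps only a k x k sketched second moment and a length-p vector of per-variable second moments, then rebuilds a low-rank-plus-diagonal covariance.

```python
import numpy as np


def train_sketch(historical, k):
    """Assumed stand-in for a learned sketch: top-k eigenvectors (p x k)
    of the sample covariance of historical data (rows are samples)."""
    cov = np.cov(historical, rowvar=False)
    _, vecs = np.linalg.eigh(cov)  # eigenvalues come back in ascending order
    return vecs[:, -k:]


class StreamingFactorCovariance:
    """Streaming low-rank-plus-diagonal covariance estimate (zero-mean data assumed)."""

    def __init__(self, sketch):
        self.sketch = sketch                  # p x k, orthonormal columns
        p, k = sketch.shape
        self.count = 0
        self.sketched_sum = np.zeros((k, k))  # running sum of (S^T x)(S^T x)^T
        self.square_sum = np.zeros(p)         # running sum of x**2 per variable

    def update(self, x):
        z = self.sketch.T @ x                 # project the sample to k dimensions
        self.sketched_sum += np.outer(z, z)
        self.square_sum += x ** 2
        self.count += 1

    def estimate(self):
        # Low-rank (common factor) part lifted back to p x p.
        low_rank = self.sketch @ (self.sketched_sum / self.count) @ self.sketch.T
        # Idiosyncratic variances: per-variable variance not explained by the factors.
        residual = self.square_sum / self.count - np.diag(low_rank)
        return low_rank + np.diag(np.clip(residual, 0.0, None))
```

A typical use would call `train_sketch` once on stored historical samples, call `update` on each arriving sample, and call `estimate` whenever a covariance matrix is needed. Between calls to `estimate`, this sketch stores O(pk + k^2) numbers rather than the O(p^2) needed for a full sample covariance.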
Pages: 2210-2219
Page count: 10