Large Covariance Estimation from Streaming Data with Knowledge-Based Sketch Matrix

Citations: 0
Authors
Tan, Xiao [1]
Wang, Zhaoyang [1]
Wang, Meng [1]
Shen, Dian [1]
Chen, Weitong [2]
Wang, Beilun [1]
Affiliations
[1] Southeast Univ, Nanjing, Peoples R China
[2] Adelaide Univ, Adelaide, SA 5005, Australia
Source
DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, DASFAA 2024, PT 5 | 2024 / Vol. 14854
Keywords
Covariance Matrix; Streaming Data; Sketching Algorithm
DOI
10.1007/978-981-97-5569-1_32
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Covariance matrix estimation is an important problem in statistics, with wide applications in finance, neuroscience, meteorology, oceanography, and other fields. However, when the data are high-dimensional and are generated and updated continuously in a streaming fashion, covariance matrix estimation faces major challenges, including the curse of dimensionality and limited memory space. Existing methods either assume sparsity, ignoring any possible common factor among the variables, or perform poorly when recovering the covariance matrix directly from sketched data. To address these issues, we propose a novel method, KEEF (Knowledge-based Time and Memory Efficient Covariance Estimator in Factor Model). Our method leverages historical data to train a knowledge-based sketch matrix, which is used to accelerate the factor analysis of streaming data and to estimate the covariance matrix directly from the sketched data. We provide theoretical guarantees showing the advantages of our method in terms of time and space complexity, as well as accuracy. We conduct extensive experiments on synthetic and real-world data, comparing KEEF with several state-of-the-art methods and demonstrating the superior performance of our method.
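The abstract does not give the details of KEEF, so the following is only a generic illustration of the idea it describes: learn a sketch matrix from historical data, compress each streaming batch with it, and rebuild a low-rank-plus-diagonal (factor-model style) covariance estimate from the sketched statistics. It is not the authors' algorithm. The names `learn_sketch` and `SketchedCovariance` are hypothetical, and using the top principal directions of the historical data as the sketch is an assumption made for this example.

```python
import numpy as np

def learn_sketch(historical, k):
    """Assumed stand-in for a 'knowledge-based' sketch: the top-k
    principal directions of historical data, returned as a (k, p) matrix."""
    centered = historical - historical.mean(axis=0)
    # Rows of vt are orthonormal right singular vectors, sorted by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k]

class SketchedCovariance:
    """Streaming estimator whose state is the (k, p) sketch, one (k, k)
    matrix, and a few vectors; no (p, p) matrix is stored between updates."""

    def __init__(self, sketch):
        self.S = sketch
        k, p = sketch.shape
        self.n = 0
        self.sum_y = np.zeros(k)        # running sum of sketched samples
        self.sum_yy = np.zeros((k, k))  # running sum of sketched outer products
        self.sum_x = np.zeros(p)        # running sum per variable
        self.sum_x2 = np.zeros(p)       # running sum of squares per variable

    def update(self, batch):
        y = batch @ self.S.T            # compress the batch from p to k columns
        self.n += batch.shape[0]
        self.sum_y += y.sum(axis=0)
        self.sum_yy += y.T @ y
        self.sum_x += batch.sum(axis=0)
        self.sum_x2 += (batch ** 2).sum(axis=0)

    def estimate(self):
        mean_y = self.sum_y / self.n
        cov_y = self.sum_yy / self.n - np.outer(mean_y, mean_y)
        low_rank = self.S.T @ cov_y @ self.S          # common-factor part
        var_x = self.sum_x2 / self.n - (self.sum_x / self.n) ** 2
        # Idiosyncratic part: per-variable variance not explained by the low-rank term.
        resid = np.clip(var_x - np.diag(low_rank), 0.0, None)
        return low_rank + np.diag(resid)

# Synthetic factor-model data: x = B f + noise, so cov(x) = B B^T + sigma^2 I.
rng = np.random.default_rng(0)
p, k, sigma = 50, 3, 0.5
B = rng.normal(size=(p, k))
true_cov = B @ B.T + sigma ** 2 * np.eye(p)

def draw(n):
    return rng.normal(size=(n, k)) @ B.T + sigma * rng.normal(size=(n, p))

S = learn_sketch(draw(2000), k)       # "historical" data
estimator = SketchedCovariance(S)
for _ in range(50):                   # 50 streaming batches of 100 samples
    estimator.update(draw(100))

est = estimator.estimate()
rel_err = np.linalg.norm(est - true_cov) / np.linalg.norm(true_cov)
print(f"relative Frobenius error: {rel_err:.3f}")
```

In this sketch the per-batch cost is dominated by the `batch @ self.S.T` product, and the full p-by-p matrix is formed only when `estimate()` is called. Whether KEEF makes the same trade-offs cannot be determined from the abstract.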
Pages: 493-502
Number of pages: 10