Block-enhanced precision matrix estimation for large-scale datasets

Times Cited: 5
Authors
Eftekhari, Aryan [1 ]
Pasadakis, Dimosthenis [1 ]
Bollhoefer, Matthias [2 ]
Scheidegger, Simon [3 ]
Schenk, Olaf [1 ]
Affiliations
[1] Univ Svizzera Italiana, Fac Informat, Inst Comp, Lugano, Switzerland
[2] TU Braunschweig, Inst Numer Anal, Braunschweig, Germany
[3] Univ Lausanne, Dept Econ, Lausanne, Switzerland
Funding
Swiss National Science Foundation;
Keywords
Covariance matrices; Graphical model; Optimization; Gaussian Markov random field; Machine learning application; SPARSE; SELECTION; PARALLEL; SOLVER; MODEL;
DOI
10.1016/j.jocs.2021.101389
CLC classification
TP39 [Computer applications];
Discipline codes
081203; 0835;
Abstract
The ℓ1-regularized Gaussian maximum likelihood method is a common approach for sparse precision matrix estimation, but one that poses a computational challenge for high-dimensional datasets. We present a novel ℓ1-regularized maximum likelihood method for performant large-scale sparse precision matrix estimation utilizing the block structures in the underlying computations. We identify the computational bottlenecks and contribute a block coordinate descent update as well as a block approximate matrix inversion routine, which is then parallelized using a shared-memory scheme. We demonstrate the effectiveness, accuracy, and performance of these algorithms. Our numerical examples and comparative results with various modern open-source packages reveal that these precision matrix estimation methods can accelerate the computation of covariance matrices by two to three orders of magnitude, while keeping memory requirements modest. Furthermore, we conduct large-scale case studies for applications from finance and medicine with several thousand random variables to demonstrate applicability for real-world datasets.
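As a point of reference for the objective discussed in the abstract, the following is a minimal sketch (not the authors' code, and making no use of their block structure) of the ℓ1-regularized Gaussian maximum-likelihood objective, f(Θ) = −log det Θ + tr(SΘ) + λ‖Θ‖₁, where S is the sample covariance; the function name and the choice to penalize only off-diagonal entries are illustrative assumptions:

```python
# Hypothetical illustration of the l1-regularized Gaussian MLE objective
# minimized in sparse precision matrix estimation (graphical lasso form):
#   f(Theta) = -log det(Theta) + tr(S @ Theta) + lam * ||Theta||_1 (off-diagonal)
import numpy as np

def neg_penalized_loglik(theta, S, lam):
    """Evaluate the penalized negative log-likelihood at precision matrix theta."""
    sign, logdet = np.linalg.slogdet(theta)
    assert sign > 0, "theta must be positive definite"
    # l1 penalty on off-diagonal entries only (diagonal left unpenalized)
    l1 = np.abs(theta).sum() - np.abs(np.diag(theta)).sum()
    return -logdet + np.trace(S @ theta) + lam * l1

# Tiny example: sample covariance from a 3-variable synthetic dataset
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
S = np.cov(X, rowvar=False)

# The unpenalized MLE, inv(S), minimizes the lam = 0 objective,
# so it scores no worse than any other positive definite candidate.
theta_mle = np.linalg.inv(S)
print(neg_penalized_loglik(theta_mle, S, lam=0.1))
print(neg_penalized_loglik(np.eye(3), S, lam=0.1))
```

For high-dimensional problems this dense formulation is exactly what becomes infeasible: forming and inverting S costs O(p³) time and O(p²) memory, which is the bottleneck the paper's block coordinate descent and block approximate inversion routines target.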
Pages: 13