Block-enhanced precision matrix estimation for large-scale datasets

Cited by: 5
Authors
Eftekhari, Aryan [1]
Pasadakis, Dimosthenis [1]
Bollhoefer, Matthias [2]
Scheidegger, Simon [3]
Schenk, Olaf [1]
Affiliations
[1] Univ Svizzera Italiana, Fac Informat, Inst Comp, Lugano, Switzerland
[2] TU Braunschweig, Inst Numer Anal, Braunschweig, Germany
[3] Univ Lausanne, Dept Econ, Lausanne, Switzerland
Funding
Swiss National Science Foundation
Keywords
Covariance matrices; Graphical model; Optimization; Gaussian Markov random field; Machine learning application; SPARSE; SELECTION; PARALLEL; SOLVER; MODEL;
DOI
10.1016/j.jocs.2021.101389
Chinese Library Classification number
TP39 [Computer applications]
Subject classification codes
081203; 0835
Abstract
The ℓ1-regularized Gaussian maximum likelihood method is a common approach for sparse precision matrix estimation, but one that poses a computational challenge for high-dimensional datasets. We present a novel ℓ1-regularized maximum likelihood method for performant large-scale sparse precision matrix estimation utilizing the block structures in the underlying computations. We identify the computational bottlenecks and contribute a block coordinate descent update as well as a block approximate matrix inversion routine, which is then parallelized using a shared-memory scheme. We demonstrate the effectiveness, accuracy, and performance of these algorithms. Our numerical examples and comparative results with various modern open-source packages reveal that these precision matrix estimation methods can accelerate the computation of covariance matrices by two to three orders of magnitude, while keeping memory requirements modest. Furthermore, we conduct large-scale case studies for applications from finance and medicine with several thousand random variables to demonstrate applicability for real-world datasets.
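For readers unfamiliar with the ℓ1-regularized Gaussian maximum likelihood objective named in the abstract, the following is a minimal sketch of how the textbook form of that objective can be evaluated. It is not the authors' block method or their solver. The function name `l1_gaussian_objective`, the parameter names, and the choice to penalize every entry of the matrix (some formulations penalize only off-diagonal entries) are assumptions made for illustration.

```python
import numpy as np


def l1_gaussian_objective(theta, sample_cov, lam):
    """Evaluate -log det(theta) + tr(sample_cov @ theta) + lam * ||theta||_1.

    theta      : candidate precision matrix (symmetric, p x p)
    sample_cov : sample covariance matrix S (p x p)
    lam        : non-negative regularization weight (hypothetical name)

    Assumption: the l1 penalty is applied to all entries of theta.
    """
    try:
        # Cholesky succeeds only for positive definite matrices,
        # which is the feasible set of the problem.
        chol = np.linalg.cholesky(theta)
    except np.linalg.LinAlgError:
        return np.inf  # infeasible point

    # log det(theta) = 2 * sum(log(diag(chol)))
    logdet = 2.0 * np.log(np.diag(chol)).sum()
    return -logdet + np.trace(sample_cov @ theta) + lam * np.abs(theta).sum()
```

A dense evaluation like this costs on the order of p^3 operations per call through the factorization, which gives a rough sense of why the abstract describes high-dimensional datasets as a computational challenge.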
Pages: 13