Block-enhanced precision matrix estimation for large-scale datasets

Cited by: 5
Authors
Eftekhari, Aryan [1 ]
Pasadakis, Dimosthenis [1 ]
Bollhoefer, Matthias [2 ]
Scheidegger, Simon [3 ]
Schenk, Olaf [1 ]
Affiliations
[1] Univ Svizzera Italiana, Fac Informat, Inst Comp, Lugano, Switzerland
[2] TU Braunschweig, Inst Numer Anal, Braunschweig, Germany
[3] Univ Lausanne, Dept Econ, Lausanne, Switzerland
Funding
Swiss National Science Foundation;
Keywords
Covariance matrices; Graphical model; Optimization; Gaussian Markov random field; Machine learning application; SPARSE; SELECTION; PARALLEL; SOLVER; MODEL;
DOI
10.1016/j.jocs.2021.101389
Chinese Library Classification
TP39 [Computer applications];
Subject classification code
081203 ; 0835 ;
Abstract
The ℓ1-regularized Gaussian maximum likelihood method is a common approach for sparse precision matrix estimation, but one that poses a computational challenge for high-dimensional datasets. We present a novel ℓ1-regularized maximum likelihood method for performant large-scale sparse precision matrix estimation utilizing the block structures in the underlying computations. We identify the computational bottlenecks and contribute a block coordinate descent update as well as a block approximate matrix inversion routine, which is then parallelized using a shared-memory scheme. We demonstrate the effectiveness, accuracy, and performance of these algorithms. Our numerical examples and comparative results with various modern open-source packages reveal that these precision matrix estimation methods can accelerate the computation of covariance matrices by two to three orders of magnitude, while keeping memory requirements modest. Furthermore, we conduct large-scale case studies for applications from finance and medicine with several thousand random variables to demonstrate applicability for real-world datasets.
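For orientation, the ℓ1-regularized Gaussian maximum likelihood problem named in the abstract is commonly written in the following standard form. This is a sketch of the textbook formulation, not a formula taken from this record: the symbols S (sample covariance matrix), Θ (precision matrix), and λ > 0 (regularization parameter) are assumptions introduced here, and the paper's exact variant (for example, whether the diagonal of Θ is penalized) may differ.

```latex
\hat{\Theta}
  = \operatorname*{arg\,min}_{\Theta \succ 0}
    \Bigl\{ -\log\det\Theta
            + \operatorname{tr}(S\,\Theta)
            + \lambda \,\lVert \Theta \rVert_{1} \Bigr\},
\qquad
\lVert \Theta \rVert_{1} = \sum_{i,j} \lvert \Theta_{ij} \rvert .
```

Larger values of λ drive more entries of the estimate to exactly zero, which is what makes the resulting precision matrix sparse.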
Pages: 13