Parallel Approximations for High-Dimensional Multivariate Normal Probability Computation in Confidence Region Detection Applications

Authors
Zhang, Xiran [1, 2]
Abdulah, Sameh [1, 3]
Cao, Jian [4]
Ltaief, Hatem [1, 3]
Sun, Ying [1, 2, 3]
Genton, Marc G. [1, 2, 3]
Keyes, David E. [1, 3]
Affiliations
[1] King Abdullah Univ Sci & Technol, Comp Elect & Math Sci & Engn Div, Thuwal, Saudi Arabia
[2] King Abdullah Univ Sci & Technol, Stat Program, Thuwal, Saudi Arabia
[3] King Abdullah Univ Sci & Technol, Extreme Comp Res Ctr, Thuwal 23955, Saudi Arabia
[4] Univ Houston, Dept Math, Houston, TX USA
Source
PROCEEDINGS 2024 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM, IPDPS 2024 | 2024
Keywords
Cholesky factorization; Confidence region detection; Excursion Set; Multivariate normal probability; Separation-of-Variables algorithm; Tile low-rank
DOI
10.1109/IPDPS57955.2024.00031
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Addressing the statistical challenge of computing the multivariate normal (MVN) probability in high dimensions holds significant potential for enhancing various applications. For example, the critical task of detecting confidence regions where a process probability surpasses a specific threshold is essential in diverse applications, such as pinpointing tumor locations in magnetic resonance imaging (MRI) scan images, determining hydraulic parameters in groundwater flow problems, and forecasting regional wind power to optimize wind turbine placement, among numerous others. One common way to compute high-dimensional MVN probabilities is the Separation-of-Variables (SOV) algorithm. This algorithm is known for its high computational complexity of O(n^3) and space complexity of O(n^2), mainly due to a Cholesky factorization operation for an n × n covariance matrix, where n represents the dimensionality of the MVN problem. This work proposes a high-performance computing framework that allows scaling the SOV algorithm and, subsequently, the confidence region detection algorithm. The framework leverages parallel linear algebra algorithms with a task-based programming model to achieve performance scalability in computing process probabilities, especially on large-scale systems. In addition, we enhance our implementation by incorporating Tile Low-Rank (TLR) approximation techniques to reduce algorithmic complexity without compromising the necessary accuracy. To evaluate the performance and accuracy of our framework, we conduct assessments using simulated data and a wind speed dataset. Our proposed implementation effectively handles high-dimensional MVN probability computations on shared and distributed-memory systems using finite-precision arithmetic and TLR approximation computation. Performance results show a significant speedup of up to 20X in solving the MVN problem using TLR approximation compared to the reference dense solution without sacrificing the application's accuracy. The qualitative results on synthetic and real datasets demonstrate how we maintain high accuracy in detecting confidence regions even when relying on TLR approximation to perform the underlying linear algebra operations.
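
To make the role of the Cholesky factor in the SOV algorithm concrete, the following is a minimal serial sketch of the textbook SOV Monte Carlo estimator for a zero-mean MVN probability. It is not the paper's implementation: it uses a dense, unblocked Cholesky factorization in pure Python, plain pseudo-random sampling, and no parallelism or TLR compression. The function names `cholesky_lower` and `mvn_prob_sov` and all parameters are hypothetical, chosen for illustration only.

```python
import math
import random
from statistics import NormalDist

_STD = NormalDist()  # standard normal, provides cdf and inv_cdf


def cholesky_lower(cov):
    """Dense Cholesky factor L with cov = L L^T: O(n^3) time, O(n^2) memory."""
    n = len(cov)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = cov[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("covariance matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L


def mvn_prob_sov(lower, upper, cov, n_samples=10000, seed=0):
    """Estimate P(lower <= X <= upper) for X ~ N(0, cov) by SOV sampling.

    Assumption: plain Monte Carlo with a fixed seed; this is an
    illustrative choice, not the sampling scheme of the paper.
    """
    rng = random.Random(seed)
    n = len(cov)
    L = cholesky_lower(cov)
    eps = 1e-12  # keeps inv_cdf arguments strictly inside (0, 1)
    total = 0.0
    for _ in range(n_samples):
        y = [0.0] * n
        f = 1.0
        for i in range(n):
            # Condition dimension i on the already sampled y[0..i-1].
            shift = sum(L[i][j] * y[j] for j in range(i))
            d = _STD.cdf((lower[i] - shift) / L[i][i])
            e = _STD.cdf((upper[i] - shift) / L[i][i])
            f *= e - d
            if i < n - 1:
                # Sample y[i] from the standard normal truncated to [d, e].
                p = d + rng.random() * (e - d)
                y[i] = _STD.inv_cdf(min(max(p, eps), 1.0 - eps))
        total += f
    return total / n_samples
```

For a bivariate case with correlation 0.5 and upper limits of zero in both dimensions, the exact orthant probability is 1/4 + arcsin(0.5)/(2π) = 1/3, which gives a simple check for the estimate returned by `mvn_prob_sov([float("-inf")] * 2, [0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]])`. The dense factorization in `cholesky_lower` is the step whose cost the abstract attributes the O(n^3) complexity to.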
Pages: 265-276
Number of pages: 12