Parallel tiled QR factorization for multicore architectures

Cited by: 80
Authors
Buttari, Alfredo [1]
Langou, Julien [2]
Kurzak, Jakub [1]
Dongarra, Jack [1, 3, 4]
Affiliations
[1] Univ Tennessee, Dept Elect Engn & Comp Sci, Knoxville, TN 37916 USA
[2] Univ Colorado, Dept Math Sci, Denver, CO 80202 USA
[3] Oak Ridge Natl Lab, Div Math & Comp Sci, Oak Ridge, TN USA
[4] Univ Manchester, Manchester, Lancs, England
Keywords
multicore; linear algebra; QR factorization;
DOI
10.1002/cpe.1301
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
As multicore systems continue to gain ground in the high-performance computing world, linear algebra algorithms have to be reformulated or new algorithms have to be developed in order to take advantage of the architectural features of these new processors. Fine-grain parallelism becomes a major requirement and introduces the necessity of loose synchronization in the parallel execution of an operation. This paper presents an algorithm for the QR factorization where the operations can be represented as a sequence of small tasks that operate on square blocks of data (referred to as 'tiles'). These tasks can be dynamically scheduled for execution based on the dependencies among them and on the availability of computational resources. This may result in an out-of-order execution of the tasks that will completely hide the presence of intrinsically sequential tasks in the factorization. Performance comparisons are presented with the LAPACK algorithm for QR factorization, where parallelism can be exploited only at the level of the BLAS operations, and with vendor implementations. Copyright © 2008 John Wiley & Sons, Ltd.
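The dependency-driven scheduling described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the task names (`factor_diag`, `apply_diag`, `factor_pair`, `apply_pair`), the whole-tile read/write tracking, and the unit-time scheduler are all assumptions made for illustration, and no numerical work is performed. The sketch only lists the tasks of a tiled factorization in sequential order, infers dependencies from the tiles each task reads or writes, and then runs whatever is ready on a fixed number of workers.

```python
from collections import defaultdict

def tiled_qr_tasks(nt):
    """Tasks of a tiled QR on an nt x nt grid of tiles, in sequential order.

    Each task is (name, reads, writes); reads/writes are sets of tile
    coordinates. Names and access sets are illustrative assumptions.
    """
    tasks = []
    for k in range(nt):
        # Factor the diagonal tile of step k.
        tasks.append((("factor_diag", k), set(), {(k, k)}))
        # Apply that transformation to the tiles to its right.
        for j in range(k + 1, nt):
            tasks.append((("apply_diag", k, j), {(k, k)}, {(k, j)}))
        for i in range(k + 1, nt):
            # Factor the diagonal tile coupled with a tile below it.
            tasks.append((("factor_pair", i, k), set(), {(k, k), (i, k)}))
            # Update the matching pair of tiles in each trailing column.
            for j in range(k + 1, nt):
                tasks.append((("apply_pair", i, k, j),
                              {(i, k)}, {(k, j), (i, j)}))
    return tasks

def dependencies(tasks):
    """Infer task dependencies from tile-level data hazards."""
    last_writer = {}
    readers = defaultdict(list)  # readers of a tile since its last write
    deps = {}
    for name, reads, writes in tasks:
        d = set()
        for tile in reads | writes:
            if tile in last_writer:
                d.add(last_writer[tile])   # read/write after write
        for tile in writes:
            d.update(readers[tile])        # write after read
        deps[name] = d
        for tile in writes:
            last_writer[tile] = name
            readers[tile] = []
        for tile in reads:
            readers[tile].append(name)
    return deps

def schedule(tasks, deps, workers):
    """Greedy unit-time schedule: each step runs up to `workers` ready tasks."""
    done = set()
    pending = [name for name, _, _ in tasks]
    steps = []
    while pending:
        ready = [t for t in pending if deps[t] <= done][:workers]
        steps.append(ready)
        done.update(ready)
        pending = [t for t in pending if t not in done]
    return steps

if __name__ == "__main__":
    tasks = tiled_qr_tasks(3)
    deps = dependencies(tasks)
    for step, batch in enumerate(schedule(tasks, deps, workers=2), start=1):
        print(step, batch)
```

Tracking whole tiles is deliberately conservative and may serialize tasks that a finer-grained analysis would allow to overlap. Even so, with more than one worker the schedule can run a later factorization task before updates that precede it in sequential order, which is the out-of-order behaviour the abstract refers to.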
Pages: 1573-1590
Page count: 18
Related papers
50 in total
  • [21] Parallel Skyline Computation on Multicore Architectures
    Park, Sungwoo
    Kim, Taekyung
    Park, Jonghyun
    Kim, Jinha
    Im, Hyeonseung
    ICDE: 2009 IEEE 25TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, VOLS 1-3, 2009, : 760 - 771
  • [22] Optimized sparse Cholesky factorization on hybrid multicore architectures
    Tang, Meng
    Gadou, Mohamed
    Rennich, Steven
    Davis, Timothy A.
    Ranka, Sanjay
    JOURNAL OF COMPUTATIONAL SCIENCE, 2018, 26 : 246 - 253
  • [23] FAST PARALLEL ALGORITHMS FOR QR AND TRIANGULAR FACTORIZATION
    CHUN, J
    KAILATH, T
    LEVARI, H
    SIAM JOURNAL ON SCIENTIFIC AND STATISTICAL COMPUTING, 1987, 8 (06): : 899 - 913
  • [24] Block householder transformation for parallel QR factorization
    Rotella, F
    Zambettakis, I
    APPLIED MATHEMATICS LETTERS, 1999, 12 (04) : 29 - 34
  • [25] Massively parallel Poisson and QR factorization solvers
    Lucka, M
    Vajtersic, M
    Viktorinova, E
    COMPUTERS & MATHEMATICS WITH APPLICATIONS, 1996, 31 (4-5) : 19 - 26
  • [26] SPARSE QR FACTORIZATION ON A MASSIVELY PARALLEL COMPUTER
    KRATZER, SG
    JOURNAL OF SUPERCOMPUTING, 1992, 6 (3-4): : 237 - 255
  • [27] An Approach of the QR Factorization for Tall-and-Skinny Matrices on Multicore Platforms
    Kuznetsov, Sergey V.
    APPLIED PARALLEL AND SCIENTIFIC COMPUTING (PARA 2012), 2013, 7782 : 235 - 249
  • [28] An efficient parallel set container for multicore architectures
    de Vega, Alvaro
    Andrade, Diego
    Fraguela, Basilio B.
    APPLICATIONS, TOOLS AND TECHNIQUES ON THE ROAD TO EXASCALE COMPUTING, 2012, 22 : 369 - 376
  • [29] PARALLEL PROGRAMMING MODELS FOR HETEROGENEOUS MULTICORE ARCHITECTURES
    Ferrer, Roger
    Bellens, Pieter
    Beltran, Vicenc
    Gonzalez, Marc
    Martorell, Xavier
    Badia, Rosa M.
    Ayguade, Eduard
    Yeom, Jae-Seung
    Schneider, Scott
    Koukos, Konstantinos
    Alvanos, Michail
    Nikolopoulos, Dimitrios S.
    Bilas, Angelos
    IEEE MICRO, 2010, 30 (05) : 42 - 53
  • [30] Parallel query processing in databases on multicore architectures
    Acker, Ralph
    Roth, Christian
    Bayer, Rudolf
    ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, PROCEEDINGS, 2008, 5022 : 2 - +