High Performance Recursive Matrix Inversion for Multicore Architectures

Cited by: 0

Authors:

Mahfoudhi, Ryma [1]

Achour, Sami [1]

Hamdi-Larbi, Olfa [1]

Mahjoub, Zaher [1]

Affiliations:

[1] Univ Tunis El Manar, Fac Sci Tunis, UR13ES38, Algorithm Parallele & Optimisat, Tunis 2092, Tunisia

Source:

2017 INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING & SIMULATION (HPCS) | 2017

Keywords:

Divide and Conquer; LU factorization; Matrix inversion; Multicore architecture; PBlas; recursive algorithm; Strassen algorithm

DOI:

10.1109/HPCS.2017.104

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

There are several approaches for computing the inverse of a dense square matrix A, namely Gaussian elimination, block-wise inversion, and LU factorization (LUF). The latter is used in mathematical software libraries such as ScaLAPACK, PBLAS and MATLAB. Once the two factors L and U are known (where A = LU), the inversion routine in the ScaLAPACK library (called PDGETRI) first inverts U (PDGETRF), then solves a triangular matrix system giving A^{-1}. A symmetric approach first inverts L, then solves a matrix system giving A^{-1}. Alternatively, one could compute the inverses of both U and L, then form their product to obtain A^{-1}. On the other hand, the Strassen fast matrix inversion algorithm is known as an efficient alternative for solving this problem. In this paper we propose a series of versions of parallel dense matrix inversion based on the 'Divide and Conquer' paradigm. A theoretical performance study allows an accurate comparison between the designed algorithms. A series of experiments validates the contribution and shows efficient performance for large matrix sizes, i.e. up to 40% faster than ScaLAPACK.
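The third variant mentioned in the abstract (invert both triangular factors, then multiply them) can be illustrated with a small divide-and-conquer sketch in pure Python. This is not the authors' implementation: the helper names `matmul`, `transpose`, `invert_upper`, `invert_lower` and `invert_from_lu` are hypothetical, the factors L and U are assumed to be already known, pivoting is ignored, and no parallelism is attempted.

```python
def matmul(X, Y):
    """Plain product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]


def transpose(X):
    """Return the transpose as a new list of lists."""
    return [list(col) for col in zip(*X)]


def invert_upper(U):
    """Recursively invert an upper triangular matrix with nonzero diagonal.

    With U = [[U11, U12], [0, U22]], the inverse is
    [[U11^{-1}, -U11^{-1} U12 U22^{-1}], [0, U22^{-1}]].
    """
    n = len(U)
    if n == 1:
        return [[1.0 / U[0][0]]]
    h = n // 2
    U11 = [row[:h] for row in U[:h]]
    U12 = [row[h:] for row in U[:h]]
    U22 = [row[h:] for row in U[h:]]
    V11 = invert_upper(U11)  # the two diagonal blocks are independent
    V22 = invert_upper(U22)  # subproblems of half the size
    V12 = [[-v for v in row] for row in matmul(matmul(V11, U12), V22)]
    top = [r11 + r12 for r11, r12 in zip(V11, V12)]
    bottom = [[0.0] * h + row for row in V22]
    return top + bottom


def invert_lower(L):
    """Invert a lower triangular matrix by reusing the upper triangular case."""
    return transpose(invert_upper(transpose(L)))


def invert_from_lu(L, U):
    """Given A = L U (no pivoting assumed), return A^{-1} = U^{-1} L^{-1}."""
    return matmul(invert_upper(U), invert_lower(L))


# Illustrative factors chosen by hand; A is built from them.
L = [[1.0, 0.0, 0.0], [2.0, 1.0, 0.0], [3.0, 4.0, 1.0]]
U = [[2.0, 1.0, 1.0], [0.0, 3.0, 1.0], [0.0, 0.0, 4.0]]
A = matmul(L, U)
A_inv = invert_from_lu(L, U)
# Largest deviation of A * A_inv from the identity matrix.
residual = max(abs(v - (1.0 if i == j else 0.0))
               for i, row in enumerate(matmul(A, A_inv))
               for j, v in enumerate(row))
```

The two recursive calls on the diagonal blocks do not depend on each other, which is the kind of structure a multicore version could exploit. A practical code would also call an optimized matrix-multiply kernel instead of the plain `matmul` used here.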


Pages: 675-682

Page count: 8


