GPU-Accelerated Adaptive PCBSO Mode-Based Hybrid RLA for Sparse LU Factorization in Circuit Simulation

Cited by: 5
Authors
Lee, Wai-Kong [1]
Achar, Ramachandra [2]
Affiliations
[1] Gachon Univ, Dept Comp Engn, Seongnam 13120, South Korea
[2] Carleton Univ, Dept Elect, Ottawa, ON K1S 5B6, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Circuit simulation; graphics processing unit (GPU); left-looking algorithm (LLA); LU factorization; multicore; parallel simulation; right-looking algorithm (RLA); simulation program with integrated circuit emphasis (SPICE); sparse matrices;
DOI
10.1109/TCAD.2020.3046572
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
LU factorization is used extensively in engineering and scientific computing to solve large systems of linear equations. Circuit simulators in particular rely heavily on the sparse version of LU factorization to solve systems involving circuit matrices. One recent advance in this field is the use of graphics processing units (GPUs) as an emerging computing platform for parallel sparse LU factorization. In this article, the following contributions are made to advance the state of the art in the hybrid right-looking algorithm (RLA): 1) a novel GPU kernel based on parallel column and block size optimization (PCBSO) is developed, which adaptively allocates the block size while optimizing the number of columns executed in parallel, based on the size of their associated submatrices at every level; this approach helps to minimize resource contention and improve computational performance; and 2) an algorithm is developed to enable execution of the new adaptive mode with dynamic parallelism. A comprehensive performance comparison using a set of benchmark circuit examples is also presented. The results indicate that the proposed advancements can improve the performance of state-of-the-art right-looking sparse LU factorization on GPUs by 1.54x (arithmetic mean).
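For readers unfamiliar with the term, the following is a minimal sketch of what "right-looking" means in LU factorization: as soon as column k of L is computed, the trailing submatrix to its right is updated immediately. It is a dense, sequential illustration without pivoting, not the article's sparse GPU kernel, and the function name `right_looking_lu` is hypothetical.

```python
def right_looking_lu(a):
    """Dense right-looking LU without pivoting (illustrative assumption only).

    Returns (L, U) with unit-diagonal L such that L * U equals the input.
    """
    n = len(a)
    u = [[float(x) for x in row] for row in a]  # working copy, becomes U
    l = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n):
        pivot = u[k][k]
        if pivot == 0.0:
            raise ZeroDivisionError("zero pivot; this sketch does not pivot")
        for i in range(k + 1, n):
            # Column k of L: multipliers below the pivot.
            l[i][k] = u[i][k] / pivot
            u[i][k] = 0.0
            # Right-looking step: update the trailing submatrix right away.
            for j in range(k + 1, n):
                u[i][j] -= l[i][k] * u[k][j]
    return l, u


L, U = right_looking_lu([[4, 3], [6, 3]])
```

In this sketch the row updates for a fixed k are independent of one another, which is the kind of per-level work a parallel implementation can distribute; how columns and block sizes are actually chosen in the PCBSO kernel is not shown here.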
Pages: 2320-2330
Number of pages: 11