Exploiting ILP, TLP, and DLP to Improve Multi-Core Performance of One-Sided Jacobi SVD

Cited by: 0
Authors
Soliman, Mostafa I. [1]
Affiliation
[1] South Valley Univ, Aswan Fac Engn, Elect Engn Dept, Comp & Syst Sect, Aswan 81542, Egypt
Keywords
multi-core computing; multi-threading techniques; ILP; TLP; DLP; SVD; one-sided Jacobi; block algorithms; high-performance computing; performance evaluation
DOI
10.1142/S0129626409000262
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
This paper shows how the performance of singular value decomposition (SVD) is enhanced through the exploitation of ILP, TLP, and DLP on Intel multi-core processors using superscalar execution, multi-threaded computation, and streaming SIMD extensions, respectively. To facilitate the exploitation of TLP on multiple execution cores, the well-known cyclic one-sided Jacobi algorithm is restructured to work in parallel. On two dual-core Intel Xeon processors with hyper-threading technology running at 3.0 GHz, our results show that the multi-threaded implementation of one-sided Jacobi SVD is about four times faster than the single-threaded superscalar implementation. Furthermore, the multi-threaded SIMD implementation speeds up the execution of single-threaded one-sided Jacobi by a factor of 10, which is close to the ideal speedup. On a reasonably large matrix that fits in the L2 cache, our results show that a performance of 11 GFLOPS (double-precision) is achieved on the target system through the exploitation of ILP, TLP, and DLP as well as the memory hierarchy.
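For orientation, the cyclic one-sided Jacobi (Hestenes) method named in the abstract can be sketched sequentially as below. This is a minimal illustration only, not the paper's restructured multi-threaded or SIMD implementation. The function name `one_sided_jacobi_singular_values` and the parameters `tol` and `max_sweeps` are assumptions introduced here.

```python
import math

def one_sided_jacobi_singular_values(a, tol=1e-12, max_sweeps=60):
    """Singular values (descending) of a matrix given as a list of rows.

    Hypothetical helper for illustration; plain sequential sweeps.
    """
    a = [[float(x) for x in row] for row in a]  # work on a copy
    m, n = len(a), len(a[0])
    for _ in range(max_sweeps):
        rotated = False
        # One cyclic sweep: visit every column pair (i, j) with i < j.
        for i in range(n - 1):
            for j in range(i + 1, n):
                alpha = sum(a[k][i] ** 2 for k in range(m))       # ||a_i||^2
                beta = sum(a[k][j] ** 2 for k in range(m))        # ||a_j||^2
                gamma = sum(a[k][i] * a[k][j] for k in range(m))  # a_i . a_j
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue  # columns already numerically orthogonal
                rotated = True
                # Plane rotation that makes columns i and j orthogonal.
                zeta = (beta - alpha) / (2.0 * gamma)
                t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                for k in range(m):
                    x, y = a[k][i], a[k][j]
                    a[k][i] = c * x - s * y
                    a[k][j] = s * x + c * y
        if not rotated:
            break  # a full sweep without rotations: converged
    # After convergence the column norms are the singular values.
    norms = [math.sqrt(sum(a[k][j] ** 2 for k in range(m))) for j in range(n)]
    return sorted(norms, reverse=True)
```

Rotations on disjoint column pairs touch different columns and are therefore independent of one another, which is the property a parallel ordering of the sweep would rely on. The inner loop over rows `k` is the kind of regular, element-wise update that SIMD instructions target.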
Pages: 355-375
Page count: 21