Vectorization techniques for the Blue Gene/L double FPU

Cited by: 11
Authors
Lorenz, J
Kral, S
Franchetti, F
Ueberhuber, CW
Affiliations
[1] Vienna Univ Technol, Inst Anal & Sci Comp, A-1040 Vienna, Austria
[2] Carnegie Mellon Univ, Dept Elect & Comp Engn, Pittsburgh, PA 15213 USA
DOI
10.1147/rd.492.0437
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
This paper presents vectorization techniques tailored to meet the specifics of the two-way single-instruction multiple-data (SIMD) double-precision floating-point unit (FPU), which is a core element of the node application-specific integrated circuit (ASIC) chips of the IBM 360-teraflops Blue Gene®/L supercomputer. This paper focuses on the general-purpose basic-block vectorization and optimization methods as they are incorporated in the Vienna MAP vectorizer and optimizer. The innovative technologies presented here, which have consistently delivered superior performance and portability across a wide range of platforms, were carried over to prototypes of Blue Gene/L and joined with the automatic performance-tuning system known as Fastest Fourier Transform in the West (FFTW). FFTW performance-optimization facilities working with the compiler technologies presented in this paper are able to produce vectorized fast Fourier transform (FFT) codes that are tuned automatically to single Blue Gene/L processors and are up to 80% faster than the best-performing scalar FFT codes generated by FFTW.
Pages: 437-446
Page count: 10