Scalarization on short vector machines

Cited by: 1
Authors
Zhao, Y [1]
Kennedy, K [1]
Affiliations
[1] Rice Univ, Dept Comp Sci, Houston, TX 77005 USA
Source
ISPASS 2005: IEEE INTERNATIONAL SYMPOSIUM ON PERFORMANCE ANALYSIS OF SYSTEMS AND SOFTWARE | 2005
Keywords
array syntax; scalarization; vectorization; SIMD; data alignment; memory hierarchy performance; loop alignment; vectorized scalar replacement;
DOI
10.1109/ISPASS.2005.1430573
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Scalarization is the process of converting array statements into loop nests so that they can run on a scalar machine. One technical difficulty of scalarization is that temporary storage often must be allocated to preserve the "fetch before store" semantics of array syntax. Many techniques have been developed to reduce the temporary storage requirement and thereby improve memory hierarchy performance. With the emergence of short vector units on modern microprocessors, it is interesting to see how to extend the preexisting scalarization methods so that the underlying vector infrastructure is fully utilized while temporary storage is kept to a minimum. In this paper, we extend a loop alignment algorithm for scalarization on short vector machines. The revised algorithm not only achieves vector execution with minimum temporary storage, but also handles data alignment properly, which is very important for performance. Our experiments on two types of widely available architectures demonstrate the effectiveness of our strategy.
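To make the "fetch before store" issue in the abstract concrete, the following is a minimal sketch in Python. It assumes a hypothetical array statement of the form A(2:N-1) = A(1:N-2) + A(3:N), which does not come from the paper, and all function names are illustrative. It is not the authors' loop alignment algorithm; it only shows why a naive loop breaks array semantics and how the temporary can shrink from a full array to a single scalar.

```python
def array_statement_reference(a):
    """Array-syntax semantics: all right-hand-side values are fetched before any store."""
    if len(a) < 3:
        return list(a)
    rhs = [a[i - 1] + a[i + 1] for i in range(1, len(a) - 1)]  # fetch everything first
    return [a[0]] + rhs + [a[-1]]  # then store


def naive_scalarization(a):
    """Direct loop translation: incorrect, because a[i - 1] is overwritten before it is read."""
    a = list(a)
    for i in range(1, len(a) - 1):
        a[i] = a[i - 1] + a[i + 1]
    return a


def scalarization_full_temp(a):
    """Correct, but needs a temporary as large as the array section being assigned."""
    a = list(a)
    n = len(a)
    temp = [0] * n
    for i in range(1, n - 1):
        temp[i] = a[i - 1] + a[i + 1]
    for i in range(1, n - 1):
        a[i] = temp[i]
    return a


def scalarization_scalar_temp(a):
    """Correct with one scalar temporary: carry the old value of a[i - 1] across iterations."""
    a = list(a)
    if len(a) < 3:
        return a
    prev = a[0]  # old value of a[i - 1]
    for i in range(1, len(a) - 1):
        old = a[i]             # save the value before it is overwritten
        a[i] = prev + a[i + 1]
        prev = old
    return a
```

On the input `[1, 2, 3, 4, 5]`, the two correct variants agree with the reference semantics, while the naive loop produces a different result because it reads values it has already overwritten.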
Pages: 187-196
Page count: 10
Related papers
10 items in total
[1]   VECTOR REGISTER ALLOCATION [J].
ALLEN, R ;
KENNEDY, K .
IEEE TRANSACTIONS ON COMPUTERS, 1992, 41 (10) :1290-1317
[2]  
ALLEN R., 2001, OPTIMIZING COMPILERS
[3]   Automatic intra-register vectorization for the Intel® architecture [J].
Bik, AJC ;
Girkar, M ;
Grey, PM ;
Tian, XM .
INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 2002, 30 (02) :65-98
[4]  
CARR S, 1994, ACM T PROGR LANG SYS, V15, P400
[5]  
Eichenberger A. E., 2004, PLDI 04
[6]  
FRABOULET A, 1999, 12 INT S SYST S NOV
[7]  
Larsen S., 2000, PLDI
[8]  
Shin J., 2002, PACT
[9]   Scalarization using loop alignment and loop skewing [J].
Zhao, Y ;
Kennedy, K .
JOURNAL OF SUPERCOMPUTING, 2005, 31 (01) :5-46
[10]  
ZHAO Y, 2001, P 2 LOS AL COMP SCI