Scalarization on short vector machines

Cited by: 1

Authors:

Zhao, Y [1]

Kennedy, K [1]

Affiliations:

[1] Rice Univ, Dept Comp Sci, Houston, TX 77005 USA

Source:

ISPASS 2005: IEEE INTERNATIONAL SYMPOSIUM ON PERFORMANCE ANALYSIS OF SYSTEMS AND SOFTWARE | 2005

Keywords:

array syntax; scalarization; vectorization; SIMD; data alignment; memory hierarchy performance; loop alignment; vectorized scalar replacement

DOI:

10.1109/ISPASS.2005.1430573

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

Scalarization is the process of converting array statements into loop nests so that they can run on a scalar machine. One technical difficulty is that temporary storage often must be allocated to preserve the "fetch before store" semantics of array syntax. Many techniques have been developed to reduce the amount of temporary storage required and thereby improve memory hierarchy performance. With the emergence of short vector units on modern microprocessors, it is natural to ask how existing scalarization methods can be extended so that the underlying vector hardware is fully utilized while temporary storage is kept to a minimum. In this paper, we extend a loop alignment algorithm for scalarization on short vector machines. The revised algorithm not only achieves vector execution with minimum temporary storage but also handles data alignment properly, which is very important for performance. Our experiments on two types of widely available architectures demonstrate the effectiveness of our strategy.
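As an illustration of the "fetch before store" difficulty the abstract mentions, the following minimal sketch scalarizes the array statement `A(2:N) = A(1:N-1) + 1` in three ways. It is not the loop alignment algorithm from the paper; the statement and all function names are hypothetical and chosen only to show why a naive loop can change the result and how a temporary array, or in this simple case a reversed loop, restores the array-syntax semantics.

```python
def array_statement(a):
    """Reference semantics: every right-hand-side element is fetched
    before any element is stored (Python slice assignment behaves this way)."""
    out = list(a)
    out[1:] = [x + 1 for x in a[:-1]]
    return out


def naive_scalarization(a):
    """Forward loop with no temporary: each iteration reads the value the
    previous iteration just stored, so the semantics are violated."""
    out = list(a)
    for i in range(1, len(out)):
        out[i] = out[i - 1] + 1
    return out


def scalarization_with_temp(a):
    """Safe but costly: a full-size temporary holds the fetched values."""
    out = list(a)
    temp = [0] * max(len(out) - 1, 0)
    for i in range(1, len(out)):
        temp[i - 1] = out[i - 1] + 1   # fetch phase
    for i in range(1, len(out)):
        out[i] = temp[i - 1]           # store phase
    return out


def scalarization_reversed(a):
    """For this particular statement, running the loop backwards reads each
    element before it is overwritten, so no temporary is needed."""
    out = list(a)
    for i in range(len(out) - 1, 0, -1):
        out[i] = out[i - 1] + 1
    return out


if __name__ == "__main__":
    data = [10, 20, 30, 40]
    print(array_statement(data))
    print(naive_scalarization(data))
    print(scalarization_with_temp(data))
    print(scalarization_reversed(data))
```

Loop reversal only works here because the statement has a single dependence direction. Statements that read both `A(i-1)` and `A(i+1)` need other techniques, which is the kind of case that storage-minimizing scalarization methods target.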


Pages: 187-196

Number of pages: 10
