Bounds modelling and compiler optimizations for superscalar performance tuning

Cited by: 4
Authors
Bose, P
Kim, S
O'Connell, FP
Ciarfella, WA
Affiliations
[1] IBM Corp, TJ Watson Res Ctr, Yorktown Heights, NY 10598 USA
[2] IBM Corp, High End Proc Dev, Austin, TX USA
Keywords
loop performance; super scalar processors; bounds analysis; compiler optimization; performance tuning;
DOI
10.1016/S1383-7621(98)00053-8
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
We consider the floating point microarchitecture support in RISC superscalar processors. We briefly review the fundamental performance trade-offs in the design of such microarchitectures. We propose a simple, yet effective bounds model to deduce the "best-case" loop performance limits for these processors. We compare these bounds to simulated and real performance measurements. From this study, we identify several loop tuning opportunities. In particular, we illustrate the use of this analysis in suggesting loop unrolling and scheduling heuristics. We report our experimental results in the context of a set of application-based loop test cases. These are designed to stress various resource limits in the core (infinite cache) microarchitecture. (C) 1999 Elsevier Science B.V. All rights reserved.
Pages: 1111-1137
Page count: 27