SEQUENTIAL HARDWARE PREFETCHING IN SHARED-MEMORY MULTIPROCESSORS

Cited by: 70
Authors
DAHLGREN, F [1]
DUBOIS, M [1]
STENSTROM, P [1]
Affiliations
[1] UNIV SO CALIF, DEPT ELECT ENGN SYST, LOS ANGELES, CA 90089 USA
Funding
U.S. National Science Foundation
Keywords
HARDWARE-CONTROLLED PREFETCHING; LATENCY TOLERANCE; MEMORY CONSISTENCY MODELS; PERFORMANCE EVALUATION; SEQUENTIAL PREFETCHING; SHARED-MEMORY MULTIPROCESSORS;
DOI
10.1109/71.395402
Chinese Library Classification
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
To offset the effect of read miss penalties on processor utilization in shared-memory multiprocessors, several software- and hardware-based data prefetching schemes have been proposed. A major advantage of hardware techniques is that they need no support from the programmer or compiler. Sequential prefetching is a simple hardware-controlled prefetching technique that relies on the automatic prefetch of consecutive blocks following the block that misses in the cache, thus exploiting spatial locality. In its simplest form, the number of prefetched blocks on each miss is fixed throughout the execution. However, since the prefetching efficiency varies during the execution of a program, we propose to adapt the number of prefetched blocks according to a dynamic measure of prefetching effectiveness. Simulations of this adaptive scheme show reductions of the number of read misses, the read penalty, and the execution time by up to 78%, 58%, and 25%, respectively.
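The fixed-degree form of sequential prefetching described in the abstract can be illustrated with a minimal sketch. It is not the authors' simulator: the LRU replacement policy, the function name `count_read_misses`, and its parameters are assumptions made for illustration, and the trace is a plain list of block numbers rather than a multiprocessor workload. The adaptive scheme would vary `degree` at run time; the abstract does not specify its effectiveness measure, so only the fixed-degree case is shown.

```python
from collections import OrderedDict


def count_read_misses(block_trace, cache_blocks, degree):
    """Count read misses for a small cache with sequential prefetching.

    block_trace  -- iterable of block numbers read by the processor
    cache_blocks -- cache capacity in blocks (LRU replacement is an assumption)
    degree       -- number of consecutive blocks prefetched on each miss
    """
    cache = OrderedDict()  # keys are cached block numbers, oldest first
    misses = 0

    def fetch(block):
        # Bring a block into the cache, evicting the least recently used
        # block when the cache is full; refresh it if already present.
        if block in cache:
            cache.move_to_end(block)
            return
        if len(cache) >= cache_blocks:
            cache.popitem(last=False)
        cache[block] = None

    for block in block_trace:
        if block in cache:
            cache.move_to_end(block)  # hit: update LRU order only
            continue
        misses += 1
        fetch(block)
        # Sequential prefetch: fetch the `degree` blocks that follow the miss.
        for k in range(1, degree + 1):
            fetch(block + k)
    return misses


sequential = list(range(16))        # high spatial locality
strided = list(range(0, 64, 8))     # consecutive blocks are never reused
print(count_read_misses(sequential, 8, 0))
print(count_read_misses(sequential, 8, 3))
print(count_read_misses(strided, 8, 1))
```

On the sequential trace, a larger `degree` removes most misses, while on the strided trace the prefetched blocks are never used. This difference in usefulness across access patterns is the motivation the abstract gives for adapting the number of prefetched blocks during execution.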
Pages: 733-746
Page count: 14