Tolerating memory latency through software-controlled pre-execution in simultaneous multithreading processors

Cited by: 78
Authors
Luk, CK [1]
Affiliations
[1] Compaq Comp Corp, VSSAD Alpha Dev Grp, Houston, TX 77269 USA
Source
28TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE, PROCEEDINGS | 2001
DOI
10.1109/ISCA.2001.937430
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Hardly predictable data addresses in many irregular applications have rendered prefetching ineffective. In many cases, the only accurate way to predict these addresses is to directly execute the code that generates them. As multithreaded architectures become increasingly popular, one attractive approach is to use idle threads on these machines to perform pre-execution - essentially a combined act of speculative address generation and prefetching - to accelerate the main thread. In this paper, we propose such a pre-execution technique for simultaneous multithreading (SMT) processors. By using software to control pre-execution, we are able to handle some of the most important access patterns that are typically difficult to prefetch. Compared with existing work on pre-execution, our technique is significantly simpler to implement (e.g., no integration of pre-execution results, no need to shorten programs for pre-execution, and no need for special hardware to copy register values upon thread spawns). Consequently, only minimal extensions to SMT machines are required to support our technique. Despite its simplicity, our technique offers an average speedup of 24% in a set of irregular applications, which is a 19% speedup over state-of-the-art software-controlled prefetching.
Pages: 40-51
Page count: 12