Runtime identification of cache conflict misses: The adaptive miss buffer

Cited by: 9
Authors
Collins, JD [1]
Tullsen, DM [1]
Affiliation
[1] Univ Calif San Diego, Dept Comp Sci & Engn, La Jolla, CA 92093 USA
Source
ACM TRANSACTIONS ON COMPUTER SYSTEMS | 2001 / Vol. 19 / No. 04
Keywords
design; performance; cache architecture; conflict misses; prefetching; victim cache; adaptive miss buffer; cache exclusion;
DOI
10.1145/502912.502913
Chinese Library Classification
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
This paper describes the miss classification table, a simple mechanism that enables the processor or memory controller to identify each cache miss as either a conflict miss or a capacity (non-conflict) miss. The miss classification table works by storing part of the tag of the most recently evicted line of a cache set. If the next miss to that cache set has a matching tag, it is identified as a conflict miss. This technique correctly identifies 88% of misses. Several applications of this information are demonstrated, including improvements to victim caching, next-line prefetching, cache exclusion, and a pseudo-associative cache. This paper also presents the adaptive miss buffer (AMB), which combines several of these techniques, targeting each miss with the most appropriate optimization, all within a single small miss buffer. The AMB's combination of techniques achieves 16% better performance than any single technique alone.
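The classification rule in the abstract can be illustrated with a minimal sketch. It assumes a direct-mapped cache and a table that keeps a few low-order tag bits of the last line evicted from each set. The class name `MissClassifier`, its parameters (`num_sets`, `line_bytes`, `mct_tag_bits`), and the returned labels are hypothetical choices made for illustration, not the authors' implementation.

```python
class MissClassifier:
    """Toy direct-mapped cache with a miss classification table (MCT).

    Assumption: the MCT holds, per set, the low-order bits of the tag of
    the most recently evicted line. All names and sizes are illustrative.
    """

    def __init__(self, num_sets=4, line_bytes=64, mct_tag_bits=8):
        self.num_sets = num_sets
        self.line_bytes = line_bytes
        self.tag_mask = (1 << mct_tag_bits) - 1
        self.cache_tags = [None] * num_sets  # full tag of the resident line
        self.mct = [None] * num_sets         # partial tag of last evicted line

    def access(self, addr):
        """Return 'hit', 'conflict', or 'non-conflict' for one access."""
        line = addr // self.line_bytes
        index = line % self.num_sets
        tag = line // self.num_sets

        if self.cache_tags[index] == tag:
            return "hit"

        # Miss: compare the missing line's partial tag with the MCT entry.
        partial = tag & self.tag_mask
        kind = "conflict" if self.mct[index] == partial else "non-conflict"

        # Record the line being evicted (if any), then fill the set.
        evicted = self.cache_tags[index]
        if evicted is not None:
            self.mct[index] = evicted & self.tag_mask
        self.cache_tags[index] = tag
        return kind


# Two addresses that map to the same set and keep evicting each other.
cache = MissClassifier(num_sets=4, line_bytes=64)
a, b = 0, 4 * 64
trace = [cache.access(x) for x in (a, b, a, a, b)]
```

In this trace the first touches of `a` and `b` are classified as non-conflict misses, while the later misses (after each line has been evicted by the other) match the stored partial tag and are classified as conflict misses. Because only part of the tag is stored, two different lines can share a partial tag, so the classification is a heuristic rather than an exact one.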
Pages: 413-439
Page count: 27