Fundamental Latency Trade-offs in Architecting DRAM Caches: Outperforming Impractical SRAM-Tags with a Simple and Practical Design

Cited by: 135
Authors
Qureshi, Moinuddin K. [1]
Loh, Gabriel H. [1]
Affiliations
[1] Georgia Inst Technol, Dept Elect & Comp Engn, Atlanta, GA 30332 USA
Source
2012 IEEE/ACM 45TH INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO-45) | 2012
Keywords
DOI
10.1109/MICRO.2012.30
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
This paper analyzes the design trade-offs in architecting large-scale DRAM caches. Prior research, including the recent work from Loh and Hill, has organized DRAM caches similarly to conventional caches. In this paper, we contend that some of the basic design decisions typically made for conventional caches (such as serialization of tag and data access, large associativity, and update of replacement state) are detrimental to the performance of DRAM caches, as they exacerbate the already high hit latency. We show that higher performance can be obtained by optimizing the DRAM cache architecture first for latency, and then for hit rate. We propose a latency-optimized cache architecture, called Alloy Cache, that eliminates the delay due to tag serialization by streaming tag and data together in a single burst. We also propose a simple and highly effective Memory Access Predictor that incurs a storage overhead of 96 bytes per core and a latency of 1 cycle. In the common case, it helps service cache misses faster, without waiting for the cache miss to be detected. Our evaluations show that our latency-optimized cache design significantly outperforms both the recent proposal from Loh and Hill and an impractical SRAM Tag-Store design that incurs an unacceptable overhead of several tens of megabytes. On average, the proposal from Loh and Hill provides 8.7% performance improvement, the "idealized" SRAM Tag design provides 24%, and our simple latency-optimized design provides 35%.
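The two mechanisms named in the abstract can be illustrated with a rough sketch. The code below is not the authors' implementation. It assumes a direct-mapped organization, a 64-byte line, and a predictor built from 256 three-bit saturating counters (a layout chosen here only because it totals the 96 bytes the abstract states). All class and method names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

LINE_BYTES = 64  # assumed cache-line size, not stated in the abstract


@dataclass
class TagAndData:
    """One direct-mapped entry: tag and data are stored side by side."""
    valid: bool = False
    tag: int = 0
    data: bytes = b""


class AlloyStyleCache:
    """Direct-mapped cache whose lookup reads tag and data in one access."""

    def __init__(self, num_sets: int) -> None:
        self.num_sets = num_sets
        self.entries = [TagAndData() for _ in range(num_sets)]
        self.accesses = 0  # counts array accesses (one per lookup or fill)

    def _split(self, addr: int) -> Tuple[int, int]:
        """Return (set index, tag) for a byte address."""
        line = addr // LINE_BYTES
        return line % self.num_sets, line // self.num_sets

    def lookup(self, addr: int) -> Optional[bytes]:
        index, tag = self._split(addr)
        self.accesses += 1  # a single access returns tag and data together
        entry = self.entries[index]
        if entry.valid and entry.tag == tag:
            return entry.data
        return None

    def fill(self, addr: int, data: bytes) -> None:
        index, tag = self._split(addr)
        self.accesses += 1
        self.entries[index] = TagAndData(True, tag, data)


class MissPredictor:
    """Saturating-counter table that guesses hit or miss (assumed layout)."""

    def __init__(self, entries: int = 256, bits: int = 3) -> None:
        self.max_value = (1 << bits) - 1
        self.table = [0] * entries
        self.storage_bytes = entries * bits // 8  # 256 * 3 / 8 = 96

    def predict_miss(self, key: int) -> bool:
        # Predict a miss when the counter is in its upper half.
        return self.table[key % len(self.table)] > self.max_value // 2

    def update(self, key: int, was_miss: bool) -> None:
        i = key % len(self.table)
        if was_miss:
            self.table[i] = min(self.max_value, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)
```

In this sketch a hit costs one array access because the tag check happens on data that has already arrived, and a predicted miss could be sent to main memory without waiting for that check. What the predictor is indexed by (the `key` argument) is left open here.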
Pages: 235-246
Page count: 12
Related papers
18 in total
[1]  
Dong X., 2010, SUPERCOMPUTING
[2]  
FARRENS M, 1995, MICRO 28
[3]  
Hartstein A., 2006, COMPUTING FRONTIERS
[4]  
HILL MD, 1988, IEEE COMPUTER
[5]  
Jiang Xiaowei, 2010, HPCA 16
[6]  
Khan S. M., 2010, PACT 19
[7]  
Loh G. H., 2011, MICRO 44
[8]  
LOH GH, EFFICIENTLY ENABLING
[9]  
LOH GH, 2012, IEEE MICRO TOP PICKS
[10]  
MADAN N, 2009, HPCA 15