Decoupled Fused Cache: Fusing a Decoupled LLC with a DRAM Cache

Cited by: 12
Authors
Vasilakis, Evangelos [1 ]
Papaefstathiou, Vassilis [2 ]
Trancoso, Pedro [1 ]
Sourdis, Ioannis [1 ]
Affiliations
[1] Chalmers Univ Technol, CSE Dept, Rannvagen 6, Gothenburg, Sweden
[2] Fdn Res & Technol Hellas FORTH, 100 Nikolaou Plastira Str, Iraklion, Greece
Funding
European Research Council; EU Horizon 2020;
Keywords
Caches; 3D stacking; DRAM; processor; memory;
DOI
10.1145/3293447
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline code
0812 ;
Abstract
DRAM caches have shown excellent potential in capturing the spatial and temporal data locality of applications, capitalizing on advances in 3D-stacking technology; however, they are still far from their ideal performance. Besides the unavoidable DRAM access to fetch the requested data, tag access is on the critical path, adding significant latency and energy costs. Existing approaches are unable to remove these overheads and in some cases limit DRAM cache design options. For instance, caching DRAM cache tags adds constant latency to every access; accessing the DRAM cache using the TLB calls for OS support and DRAM cachelines as large as a page; reusing the last-level cache (LLC) tags to access the DRAM cache limits LLC performance, as it requires indexing the LLC using higher-order address bits. In this article, we introduce Decoupled Fused Cache, a DRAM cache design that alleviates the cost of tag accesses by fusing DRAM cache tags with the tags of the on-chip LLC without affecting LLC performance. In essence, the Decoupled Fused Cache relies in most cases on the LLC tag access to retrieve the information required for accessing the DRAM cache while avoiding additional overheads. Compared to current DRAM cache designs of the same cacheline size, Decoupled Fused Cache improves system performance by 6% on average and by 16% to 18% for large cacheline sizes. Finally, Decoupled Fused Cache reduces DRAM cache traffic by 18% and DRAM cache energy consumption by 7%.
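The core idea in the abstract — letting a single on-chip LLC tag probe also locate the block in the DRAM cache — can be illustrated with a minimal model. This is a hedged sketch, not the paper's actual implementation: the class and field names (`FusedTagEntry`, `dram_way`, `lookup`) are hypothetical, and real fused-tag metadata would be more involved.

```python
# Minimal sketch (hypothetical names and field layout) of a "fused" tag
# lookup: each on-chip LLC tag entry also carries the DRAM-cache way of
# the same block, so one LLC tag probe can locate data in the DRAM cache
# without a separate DRAM tag access.

class FusedTagEntry:
    def __init__(self, tag, llc_valid, dram_way):
        self.tag = tag              # block address tag
        self.llc_valid = llc_valid  # data present in the on-chip LLC?
        self.dram_way = dram_way    # way within the DRAM-cache set, or None

def lookup(tag_set, tag):
    """Return ('llc', entry) on an on-chip data hit, ('dram', way) when
    the fused tags say the block lives in the DRAM cache, or
    ('memory', None) on a miss in both."""
    for entry in tag_set:
        if entry.tag == tag:
            if entry.llc_valid:
                return ("llc", entry)
            if entry.dram_way is not None:
                # Fused tag hit: the DRAM-cache location is already known,
                # avoiding the extra DRAM access for tags.
                return ("dram", entry.dram_way)
    return ("memory", None)  # fetch from main memory

# usage
tag_set = [FusedTagEntry(0x1A, True, 2), FusedTagEntry(0x2B, False, 5)]
print(lookup(tag_set, 0x2B))  # → ('dram', 5)
```

The savings the abstract quotes (reduced DRAM cache traffic and energy) correspond to the `('dram', way)` path above: the separate DRAM tag probe is skipped whenever the LLC tag array already answers the question.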
Pages: 23