CACTI 7: New Tools for Interconnect Exploration in Innovative Off-Chip Memories

Cited by: 288
Authors
Balasubramonian, Rajeev [1, 3]
Kahng, Andrew B. [2]
Muralimanohar, Naveen
Shafiee, Ali [1]
Srinivas, Vaishnav [2]
Affiliations
[1] Univ Utah, Salt Lake City, UT 84112 USA
[2] Univ Calif San Diego, San Diego, CA 92103 USA
[3] 50 S Cent Campus Dr,Rm 3190, Salt Lake City, UT 84112 USA
Keywords
Memory; DRAM; NVM; interconnects; tools; Design; Algorithms; Performance
DOI
10.1145/3085572
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Historically, server designers have opted for simple memory systems by picking one of a few commoditized DDR memory products. We are already witnessing a major upheaval in the off-chip memory hierarchy, with the introduction of many new memory products: buffer-on-board, LRDIMM, HMC, HBM, and NVMs, to name a few. Given the plethora of choices, different vendors are expected to adopt different strategies for their high-capacity memory systems, often deviating from DDR standards and/or integrating new functionality within memory systems. These strategies will likely differ in their choice of interconnect and topology, with a significant fraction of memory energy being dissipated in I/O and data movement. To make the case for memory interconnect specialization, this paper makes three contributions. First, we design a tool that carefully models I/O power in the memory system, explores the design space, and lets the user define new types of memory interconnects and topologies. The tool is validated against SPICE models and is integrated into version 7 of the popular CACTI package. Our analysis with the tool shows that several design parameters have a significant impact on I/O power. Second, we use the tool to help craft novel specialized memory system channels. We introduce a new relay-on-board chip that partitions a DDR channel into multiple cascaded channels. We show that this simple change to the channel topology can improve performance by 22% and lower cost by up to 65% for DDR DRAM. This new architecture does not require any changes to DIMMs, and it efficiently supports hybrid DRAM/NVM systems. Finally, as an example of a more disruptive architecture, we design a custom DIMM and parallel bus that moves away from the DDR3/DDR4 standards. To reduce energy and improve performance, the baseline data channel is split into three narrow parallel channels and the on-DIMM interconnects are operated at a lower frequency. In addition, this allows us to design a two-tier error protection strategy that reduces data transfers on the interconnect. This architecture yields a performance improvement of 18% and a memory power reduction of 23%. The cascaded channel and narrow channel architectures serve as case studies for the new tool and show the potential benefit of re-organizing basic memory interconnects.
Pages: 25