Fine-Grained DRAM: Energy-Efficient DRAM for Extreme Bandwidth Systems

Cited by: 133
Authors
O'Connor, Mike [1, 2]
Chatterjee, Niladrish [1]
Lee, Donghyuk [1]
Wilson, John [1]
Agrawal, Aditya [1]
Keckler, Stephen W. [1, 2]
Dally, William J. [1, 3]
Affiliations
[1] NVIDIA, Santa Clara, CA 95051 USA
[2] Univ Texas Austin, Austin, TX 78712 USA
[3] Stanford Univ, Stanford, CA 94305 USA
Source
50th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO) | 2017
Keywords
DRAM; Energy-Efficiency; High Bandwidth; GPU
DOI
10.1145/3123939.3124545
Chinese Library Classification
TP301 [Theory and Methods]
Subject classification code
081202
Abstract
Future GPUs and other high-performance throughput processors will require multiple TB/s of bandwidth to DRAM. Satisfying this bandwidth demand within an acceptable energy budget is a challenge in these extreme-bandwidth memory systems. We propose a new high-bandwidth DRAM architecture, Fine-Grained DRAM (FGDRAM), which improves bandwidth by 4x and improves the energy efficiency of DRAM by 2x relative to the highest-bandwidth, most energy-efficient contemporary DRAM, High Bandwidth Memory (HBM2). These benefits are achieved in large measure by partitioning the DRAM die into many independent units, called grains, each of which has a local, adjacent I/O. This approach allows the bandwidth of all the banks in the DRAM to be used simultaneously by eliminating the shared buses that interconnect the banks. Furthermore, the on-DRAM data movement energy is significantly reduced because the wiring distance between the cell array and the local I/O is much shorter. The FGDRAM architecture also lends itself readily to existing techniques for reducing the effective DRAM row size in an area-efficient manner, which reduces wasteful row-activation energy in applications with low locality. In addition, when FGDRAM is paired with a memory controller optimized to exploit the additional concurrency provided by the independent grains, it improves GPU system performance by 19% over an iso-bandwidth and iso-capacity future HBM baseline. Thus, this energy-efficient, high-bandwidth FGDRAM architecture addresses the needs of future extreme-bandwidth memory systems.
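To make the relationship between the two headline ratios concrete, the following is a minimal Python sketch of the usual power model (power equals bits moved per second times energy per bit). Only the 4x bandwidth and 2x energy-efficiency ratios come from the abstract. The baseline bandwidth and energy-per-bit values, the function name `dram_power_watts`, and the reading of "2x energy efficiency" as half the energy per bit are assumptions made purely for illustration.

```python
import math


def dram_power_watts(bandwidth_bytes_per_s: float, energy_pj_per_bit: float) -> float:
    """Return DRAM power in watts: bits moved per second times energy per bit."""
    bits_per_s = bandwidth_bytes_per_s * 8
    return bits_per_s * energy_pj_per_bit * 1e-12  # convert pJ to J


# Hypothetical baseline values, chosen only for illustration (not from the paper).
BASE_BW_BYTES_PER_S = 1e12   # 1 TB/s
BASE_PJ_PER_BIT = 4.0

# Ratios stated in the abstract.
BANDWIDTH_GAIN = 4.0
EFFICIENCY_GAIN = 2.0        # assumed to mean half the energy per bit

base_power = dram_power_watts(BASE_BW_BYTES_PER_S, BASE_PJ_PER_BIT)

# Running at the full 4x bandwidth with half the energy per bit.
full_bw_power = dram_power_watts(
    BASE_BW_BYTES_PER_S * BANDWIDTH_GAIN, BASE_PJ_PER_BIT / EFFICIENCY_GAIN
)

# Running at the baseline bandwidth with half the energy per bit.
same_bw_power = dram_power_watts(
    BASE_BW_BYTES_PER_S, BASE_PJ_PER_BIT / EFFICIENCY_GAIN
)

print(f"power ratio at 4x bandwidth:   {full_bw_power / base_power:.2f}")
print(f"power ratio at equal bandwidth: {same_bw_power / base_power:.2f}")
```

Under these assumptions the ratios do not depend on the placeholder baseline: at equal bandwidth the DRAM power halves, while at the full 4x bandwidth it doubles. This is why energy per bit, and not bandwidth alone, sets the energy budget of a multi-TB/s memory system.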
Pages: 41-54
Page count: 14