BATMAN: Techniques for Maximizing System Bandwidth of Memory Systems with Stacked-DRAM

Cited by: 28

Authors:

Chou, Chiachen [1]

Jaleel, Aamer [2]

Qureshi, Moinuddin [1]

Affiliations:

[1] Georgia Inst Technol, Sch ECE, Atlanta, GA 30332 USA

[2] NVIDIA, NVIDIA Res, Santa Clara, CA USA

Source:

MEMSYS 2017: PROCEEDINGS OF THE INTERNATIONAL SYMPOSIUM ON MEMORY SYSTEMS | 2017

Funding:

U.S. National Science Foundation

Keywords:

POLICIES;

DOI:

10.1145/3132402.3132404

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods]

Subject classification code:

081202

Abstract:

Tiered-memory systems consist of high-bandwidth 3D-DRAM and high-capacity commodity-DRAM. Conventional designs attempt to improve system performance by maximizing the number of memory accesses serviced by 3D-DRAM. However, when the commodity-DRAM bandwidth is a significant fraction of overall system bandwidth, these techniques inefficiently utilize the total bandwidth offered by the tiered-memory system and yield suboptimal performance. In such situations, performance can be improved by distributing memory accesses in proportion to the bandwidth of each memory. Ideally, we want a simple and effective runtime mechanism that achieves the desired access distribution without requiring significant hardware or software support. This paper proposes Bandwidth-Aware Tiered-Memory Management (BATMAN), a runtime mechanism that manages the distribution of memory accesses in a tiered-memory system by explicitly controlling data movement. BATMAN monitors the number of accesses to both memories, and when the number of 3D-DRAM accesses exceeds the desired threshold, BATMAN disallows data movement from the commodity-DRAM to 3D-DRAM and proactively moves data from 3D-DRAM to commodity-DRAM. We demonstrate BATMAN on systems that architect the 3D-DRAM as either a hardware-managed cache (cache mode) or a part of the OS-visible memory space (flat mode). Our evaluations on a system with 4GB 3D-DRAM and 32GB commodity-DRAM show that BATMAN improves performance by an average of 11% and 10% and energy-delay product by 13% and 11% for systems in the cache and flat modes, respectively. BATMAN incurs only an eight-byte hardware overhead and requires negligible software modification.
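The control loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `target_fraction`, `AccessMonitor`, and `decide`, the two-counter layout, and the bandwidth figures in the usage note are all hypothetical, chosen only to show the idea of comparing the observed 3D-DRAM access share against a bandwidth-proportional target.

```python
from dataclasses import dataclass


def target_fraction(bw_stacked: float, bw_commodity: float) -> float:
    """Share of accesses the 3D-DRAM would serve if accesses were split
    in proportion to each memory's bandwidth (assumed target rule)."""
    return bw_stacked / (bw_stacked + bw_commodity)


@dataclass
class AccessMonitor:
    """Hypothetical monitor: two counters, one per memory tier."""
    target: float               # desired fraction of accesses served by 3D-DRAM
    stacked_accesses: int = 0
    commodity_accesses: int = 0

    def record(self, to_stacked: bool) -> None:
        # Count one access against whichever memory serviced it.
        if to_stacked:
            self.stacked_accesses += 1
        else:
            self.commodity_accesses += 1

    def stacked_fraction(self) -> float:
        total = self.stacked_accesses + self.commodity_accesses
        return self.stacked_accesses / total if total else 0.0

    def decide(self) -> str:
        # Above the target: stop moving data into 3D-DRAM and move some
        # back to commodity-DRAM. Otherwise allow normal data movement.
        if self.stacked_fraction() > self.target:
            return "throttle"
        return "allow"
```

As a usage illustration with made-up numbers: if the 3D-DRAM offered four times the bandwidth of the commodity-DRAM, `target_fraction(4.0, 1.0)` gives a target of 0.8, and a monitor that has seen 9 of 10 accesses go to 3D-DRAM would report `"throttle"`.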


Pages: 268-280

Page count: 13

References: 47 in total
