Exploiting In-Memory Data Patterns for Performance Improvement on Crossbar Resistive Memory

Cited by: 3

Authors:

Wen, Wen [1]

Zhao, Lei [2]

Zhang, Youtao [2]

Yang, Jun [1]

Affiliations:

[1] Univ Pittsburgh, Dept Elect & Comp Engn, Pittsburgh, PA 15261 USA

[2] Univ Pittsburgh, Dept Comp Sci, Pittsburgh, PA 15260 USA

Source:

IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS | 2020 / Vol. 39 / No. 10

Funding:

U.S. National Science Foundation

Keywords:

Computer architecture; Microprocessors; Resistance; Random access memory; Correlation; Switches; Nonvolatile memory; Crossbar array; data pattern; resistive memory (ReRAM); write performance; DEVICE; ENERGY; TECHNOLOGY; CHALLENGES; FUTURE; MODEL; ARRAY; WRITE;

DOI:

10.1109/TCAD.2019.2940685

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

Resistive memory (ReRAM) has emerged as a promising nonvolatile memory technology that may replace a significant portion of DRAM in future computer systems. ReRAM has many advantages, such as high density, low standby power, and good scalability. When adopting the crossbar architecture, a ReRAM cell can achieve the smallest theoretical size in fabrication, which is ideal for constructing dense, large-capacity memory. However, the crossbar cell structure suffers from a variety of reliability issues that stem from large voltage drops on long wires. To ensure reliable operation, ReRAM writes conservatively use the worst-case access latency of all cells in the ReRAM arrays, which leads to significant performance degradation and wasted dynamic energy. In this article, we study the correlation between the ReRAM cell switching latency and the number of cells in the low-resistance state (LRS) along bitlines, and propose to dynamically speed up write operations based on bitline data patterns, i.e., the number of LRS cells present in bitlines. We leverage the intrinsic in-memory processing capability of the ReRAM crossbar and propose a low-overhead runtime profiler that effectively tracks the data patterns in different bitlines. To reduce write latency further, we employ data compression and a row-address-dependent memory data layout to reduce the number of LRS cells on bitlines. Moreover, we present two optimization techniques, selective profiling and fine-grained profiling, to mitigate the energy overhead of tracking bitline data patterns. The experimental results show that, on average, our design improves system performance by 20.5% and 14.2%, and reduces memory dynamic energy by 20.3% and 12.6%, compared to the baseline and the state-of-the-art crossbar design, respectively.
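
The core idea in the abstract, choosing a write latency from the number of LRS cells on each bitline instead of always using the worst case, can be illustrated with a minimal sketch. This is not the authors' profiler or circuit model: the encoding of LRS as 1, the tier thresholds, the latency values, and all function names are assumptions made purely for illustration.

```python
from bisect import bisect_left

# Hypothetical (max LRS count, write latency in ns) tiers, sorted by count.
# The numbers are illustrative assumptions, not values from the paper.
LATENCY_TIERS = [(16, 50), (64, 100), (256, 200)]


def lrs_counts(array):
    """Count LRS cells (assumed to be stored as 1) on each bitline (column)."""
    return [sum(column) for column in zip(*array)]


def write_latency_ns(lrs_count, tiers=LATENCY_TIERS):
    """Pick the fastest tier whose LRS limit covers this bitline.

    Counts above the last limit fall back to the slowest (worst-case) tier.
    """
    limits = [limit for limit, _ in tiers]
    index = min(bisect_left(limits, lrs_count), len(tiers) - 1)
    return tiers[index][1]


def row_write_latency_ns(array, tiers=LATENCY_TIERS):
    """A row write touches many bitlines; the slowest one bounds the write."""
    return max(write_latency_ns(count, tiers) for count in lrs_counts(array))
```

Under these assumptions, a row write is only as fast as its most LRS-heavy bitline, which is why reducing LRS counts through compression or data layout would shorten the selected latency.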


Pages: 2347-2360

Page count: 14

