Exploiting In-Memory Data Patterns for Performance Improvement on Crossbar Resistive Memory

Cited by: 3
Authors
Wen, Wen [1]
Zhao, Lei [2]
Zhang, Youtao [2]
Yang, Jun [1]
Affiliations
[1] Univ Pittsburgh, Dept Elect & Comp Engn, Pittsburgh, PA 15261 USA
[2] Univ Pittsburgh, Dept Comp Sci, Pittsburgh, PA 15260 USA
Funding
U.S. National Science Foundation
Keywords
Computer architecture; Microprocessors; Resistance; Random access memory; Correlation; Switches; Nonvolatile memory; Crossbar array; data pattern; resistive memory (ReRAM); write performance; DEVICE; ENERGY; TECHNOLOGY; CHALLENGES; FUTURE; MODEL; ARRAY; WRITE;
DOI
10.1109/TCAD.2019.2940685
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Resistive memory (ReRAM) has emerged as a promising nonvolatile memory technology that may replace a significant portion of DRAM in future computer systems. ReRAM has many advantages, such as high density, low standby power, and good scalability. With a crossbar architecture, a ReRAM cell can achieve the smallest theoretical cell size in fabrication, which is ideal for constructing dense, large-capacity memory. However, the crossbar cell structure suffers from a variety of reliability issues that stem from large voltage drops on long wires. To ensure reliable operation, ReRAM writes conservatively use the worst-case access latency of all cells in the ReRAM arrays, which leads to significant performance degradation and wasted dynamic energy. In this article, we study the correlation between ReRAM cell switching latency and the number of cells in the low-resistance state (LRS) along bitlines, and propose to dynamically speed up write operations based on bitline data patterns, i.e., the number of LRS cells present on bitlines. We leverage the intrinsic in-memory processing capability of the ReRAM crossbar and propose a low-overhead runtime profiler that effectively tracks the data patterns in different bitlines. To further reduce write latency, we employ data compression and a row-address-dependent memory data layout to reduce the number of LRS cells on bitlines. Moreover, we present two optimization techniques, selective profiling and fine-grained profiling, to mitigate the energy overhead of tracking bitline data patterns. Experimental results show that, on average, our design improves system performance by 20.5% and 14.2%, and reduces memory dynamic energy by 20.3% and 12.6%, compared to the baseline and the state-of-the-art crossbar design, respectively.
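The core idea in the abstract, choosing a write latency from the number of LRS cells on the affected bitlines instead of always using the worst case, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the encoding of an LRS cell as `1`, and every threshold and latency value in `EXAMPLE_TABLE` are hypothetical placeholders chosen only for illustration. The sketch assumes, as the abstract implies, that fewer LRS cells on a bitline permit a shorter write.

```python
from bisect import bisect_left

# Hypothetical table of (max LRS cells on a bitline, write latency in ns).
# These numbers are illustrative placeholders, not figures from the paper.
EXAMPLE_TABLE = [(64, 30), (128, 45), (256, 70), (512, 100)]


def count_lrs_per_bitline(array):
    """Count LRS cells (assumed to be encoded as 1) in each column.

    `array` is a list of rows (wordlines); each column is one bitline.
    """
    return [sum(column) for column in zip(*array)]


def write_latency_ns(lrs_counts, bitlines, table=EXAMPLE_TABLE):
    """Pick a write latency from the LRS count of the worst bitline touched.

    Falls back to the last (worst-case) table entry when the count
    exceeds every threshold.
    """
    worst = max(lrs_counts[b] for b in bitlines)
    thresholds = [limit for limit, _ in table]
    index = min(bisect_left(thresholds, worst), len(table) - 1)
    return table[index][1]


if __name__ == "__main__":
    cells = [
        [1, 0, 1],
        [1, 0, 0],
        [1, 1, 0],
    ]
    counts = count_lrs_per_bitline(cells)
    print(counts)
    print(write_latency_ns(counts, [1, 2], table=[(1, 10), (2, 20), (3, 30)]))
```

In this sketch a write that touches several bitlines is timed by the bitline with the most LRS cells, which keeps the choice conservative while still avoiding the global worst case when the data pattern allows it.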
Pages: 2347-2360
Page count: 14