STT-RAM Cache Hierarchy With Multiretention MTJ Designs

Cited by: 41
Authors
Sun, Zhenyu [1]
Bi, Xiuyuan [1]
Li, Hai [1]
Wong, Weng-Fai [2]
Zhu, Xiaochun [3]
Affiliations
[1] Univ Pittsburgh, Dept Elect & Comp Engn, Pittsburgh, PA 15261 USA
[2] Natl Univ Singapore, Singapore 119077, Singapore
[3] Qualcomm Inc, Qualcomm CDMA Technol, San Diego, CA 92121 USA
Funding
U.S. National Science Foundation;
Keywords
Cache hierarchy; magnetic tunnel junction (MTJ); retention time; spin-transfer torque random access memory (STT-RAM); spintronic memristor; switching current; ARCHITECTURE; CIRCUIT; WRITE; POWER; BODY; MRAM;
DOI
10.1109/TVLSI.2013.2267754
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Spin-transfer torque random access memory (STT-RAM) is the most promising candidate to be universal memory due to its good scalability, zero standby power, and radiation hardness. Its cell area, only 1/9 to 1/3 that of SRAM, allows for a much larger cache with the same die footprint. Such a reduction in cell size can significantly shrink the cache array size, leading to significant improvement of overall system performance and power consumption, especially in this multicore era where locality is crucial. However, deploying STT-RAM technology in L1 caches is challenging because write operations on STT-RAM are slow and power-consuming. In this paper, we propose a range of cache hierarchy designs implemented entirely using STT-RAM that deliver optimal power saving and performance. In particular, our designs use STT-RAM cells with various data retention times and write performances, made possible by novel magnetic tunneling junction designs. For L1 caches, where speed is of utmost importance, we propose a scheme that uses fast STT-RAM cells with reduced data retention time coupled with a dynamic refresh scheme. In the dynamic refresh scheme, another emerging technology, the memristor, is used as the counter to monitor the data retention of the low-retention STT-RAM, achieving a higher array area efficiency than an SRAM-based counter. For lower level caches with relatively larger cache capacities, we propose a design that has partitions of different retention characteristics, and a data migration scheme that moves data between these partitions. The experiments show that, on average, our proposed multiretention level STT-RAM cache reduces total energy by 30% to 74.2% compared to previous single retention level STT-RAM caches, while improving instructions per cycle performance for both two-level and three-level cache hierarchies.
Pages: 1281-1293
Number of pages: 13
Related papers
28 in total
[1]  
[Anonymous], APPL PHYS LETT
[2]  
[Anonymous], J APPL PHYS
[3]  
[Anonymous], P 16 AS S PAC DES AU
[4]  
[Anonymous], 2008, Cacti
[5]  
Barth J., 2010, International Solid-State Circuits Conference, P342
[6]   New paradigm of predictive MOSFET and interconnect modeling for early circuit simulation [J].
Cao, Y ;
Sato, T ;
Orshansky, M ;
Sylvester, D ;
Hu, CM .
PROCEEDINGS OF THE IEEE 2000 CUSTOM INTEGRATED CIRCUITS CONFERENCE, 2000, :201-204
[7]   MEMRISTOR - MISSING CIRCUIT ELEMENT [J].
CHUA, LO .
IEEE TRANSACTIONS ON CIRCUIT THEORY, 1971, CT18 (05) :507-+
[8]  
De Sandre Guido, 2010, 2010 IEEE International Solid-State Circuits Conference (ISSCC), P268, DOI 10.1109/ISSCC.2010.5433911
[9]  
Desikan R., 2008, On-chip MRAM as a High-Bandwidth Low-Latency Replacement for DRAM Physical Memories
[10]   Spin-transfer torque switching in magnetic tunnel junctions and spin-transfer torque random access memory [J].
Diao, Zhitao ;
Li, Zhanjie ;
Wang, Shengyuang ;
Ding, Yunfei ;
Panchula, Alex ;
Chen, Eugene ;
Wang, Lien-Chang ;
Huai, Yiming .
JOURNAL OF PHYSICS-CONDENSED MATTER, 2007, 19 (16)