Migration-Aware Loop Retiming for STT-RAM-Based Hybrid Cache in Embedded Systems

Cited by: 9
Authors
Qiu, Keni [1]
Zhao, Mengying [1]
Li, Qingan [1]
Fu, Chenchen [1]
Xue, Chun Jason [1]
Affiliations
[1] City Univ Hong Kong, Dept Comp Sci, Kowloon 999077, Hong Kong, Peoples R China
Keywords
Embedded systems; energy; hybrid cache; loop retiming; migration; STT-RAM; ARCHITECTURE
DOI
10.1109/TCAD.2013.2288692
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Hybrid cache architectures that combine spin-transfer torque RAM (STT-RAM) and SRAM have recently been proposed for energy efficiency. Migration-based techniques for such caches dynamically move write-intensive and read-intensive data between STT-RAM and SRAM to exploit the advantages of the hybrid cache. However, migrations also introduce extra reads and writes during data movement. For stencil loops with read and write data dependencies, we observe that the migration overhead is significant and that migrations correlate closely with the interleaved read and write access pattern within a memory block. This paper proposes a compile-time loop retiming framework that reduces migration overhead by changing the interleaved memory access pattern. With the proposed loop retiming technique, interleaved memory accesses are significantly reduced, which mitigates migration overhead and significantly improves the energy efficiency of the hybrid cache. Experimental results show that, on average, the proposed methods reduce the number of migrations by up to 27.1% and the cache dynamic energy by up to 14.0%.
Pages: 329-342
Number of pages: 14