Efficient Loop Scheduling for Chip Multiprocessors with Non-Volatile Main Memory

Cited by: 4
Authors
Du, Jiayi [1]
Wang, Yan [1]
Zhuge, Qingfeng [3]
Hu, Jingtong [2]
Sha, Edwin H.-M. [2, 3]
Affiliations
[1] Hunan Univ, Coll Informat Sci & Engn, Changsha 410082, Hunan, Peoples R China
[2] Univ Texas Dallas, Dept Comp Sci, Richardson, TX 75080 USA
[3] Chongqing Univ, Coll Comp Sci, Chongqing 40044, Peoples R China
Source
JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY | 2013, Vol. 71, No. 3
Funding
National Science Foundation (USA);
Keywords
Non-volatile memory; Loop scheduling algorithm; Chip multiprocessor;
DOI
10.1007/s11265-012-0703-5
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Non-volatile memories (NVMs) show great potential to replace DRAM as main memory in many embedded systems because of attractive characteristics such as low cost, high density, and low energy consumption. However, the asymmetry between read and write costs must be addressed before the advantages of NVM can be fully exploited: on NVMs, a write operation is much more expensive than a read operation. Existing loop optimization techniques cannot be used effectively with non-volatile main memory because they do not take this asymmetry into account. In this paper, we propose an efficient loop scheduling algorithm, the Rotation with Maximum Bipartite Matching (RMBM) algorithm, to address the problem of expensive write operations on non-volatile main memory for chip multiprocessors (CMPs). It achieves high parallelism for a loop while reducing the number of write operations on NVM. Experimental results show that the RMBM algorithm reduces the number of write activities on NVM by 34.5% on average compared with the traditional rotation scheduling algorithm. Execution time is reduced by 20.5% and energy consumption by 15.03% on average using the RMBM algorithm. In other words, the average lifetime of NVM can be extended by more than two times using the proposed technique.
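The abstract names maximum bipartite matching as a building block of RMBM but does not say how the matching is constructed or used. As a minimal sketch of that building block only, and not of the RMBM algorithm itself, the following Python function computes a maximum matching with the standard augmenting-path method. The function name, the task labels, and the reading of an edge as "these two tasks could share a core so that an intermediate result need not be written to NVM" are illustrative assumptions, not details from the paper.

```python
from typing import Dict, List, Set


def max_bipartite_matching(adj: Dict[str, List[str]]) -> Dict[str, str]:
    """Return a maximum matching {left_node: right_node}.

    adj maps each left node to the right nodes it may be paired with.
    Uses the classic augmenting-path method (hypothetical helper, not
    taken from the paper).
    """
    match_right: Dict[str, str] = {}  # right node -> left node currently paired

    def try_augment(u: str, seen: Set[str]) -> bool:
        # Try to pair u, re-pairing earlier left nodes if that frees a slot.
        for v in adj.get(u, []):
            if v in seen:
                continue
            seen.add(v)
            if v not in match_right or try_augment(match_right[v], seen):
                match_right[v] = u
                return True
        return False

    for u in adj:
        try_augment(u, set())

    # Invert to left -> right for easier reading.
    return {left: right for right, left in match_right.items()}


# Assumed example: left nodes are producer tasks, right nodes are consumer
# tasks that could be placed on the same core (illustrative labels only).
candidates = {"A": ["C", "D"], "B": ["C"], "E": ["D"]}
pairs = max_bipartite_matching(candidates)
print(len(pairs), pairs)
```

In this assumed example only two of the three producers can be paired, because both C and D have a single slot; the augmenting step moves A from C to D so that B can take C.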
Pages: 261-273
Number of pages: 13