Performance optimization for parallel systems with shared DWM via retiming, loop scheduling, and data placement

Cited by: 3
Authors
Gao, Siyuan [1]
Gu, Shouzhen [2]
Xu, Rui [1]
Sha, Edwin Hsing-Mean [1]
Zhuge, Qingfeng [1]
Affiliations
[1] East China Normal Univ, Sch Comp Sci & Technol, Shanghai, Peoples R China
[2] East China Normal Univ, Minist Educ, Engn Res Ctr Software Hardware Codesign Technol &, Shanghai, Peoples R China
Keywords
Domain wall memory; Loop scheduling; Data placement; Retiming; Shift operation;
DOI
10.1016/j.sysarc.2020.101842
CLC number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Domain Wall Memory (DWM), an ideal candidate for replacing traditional memories, especially in parallel systems, has many desirable characteristics such as low leakage power, high density, and low access latency. However, because of the tape-like architecture of DWM, shift operations have a major impact on performance. For data-intensive applications with many loops and arrays, increasing loop parallelism and choosing appropriate loop scheduling and data placement on DWM can significantly improve the performance of parallel systems. This paper explores optimizing the performance of parallel systems through retiming, loop scheduling, and data placement, especially when the data are arrays. It proposes an Integer Linear Programming (ILP) formulation and a Scheduling While Placing (SWP) algorithm to generate optimal or near-optimal loop schedules and data placements with minimum execution time. The experimental results show that SWP and ILP effectively reduce execution time compared with the greedy List Scheduling First Access First Place (LF) algorithm. In addition, this paper proposes a Threshold Retiming Repetition (TRR) algorithm that combines the retiming technique with SWP and ILP. The experimental results show that SWP+TRR and ILP+TRR further reduce execution time compared with the results obtained without retiming.
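The link between data placement and shift operations can be illustrated with a minimal sketch. It assumes a deliberately simplified model that is not taken from the paper: a single tape with one access port, where moving the port by one position costs one shift. The function name `total_shifts`, the placements, and the access sequence are hypothetical, and the sketch is not the ILP formulation or the SWP algorithm.

```python
def total_shifts(placement, accesses, start=0):
    """Count shift operations for an access sequence on one tape.

    Assumed model (not from the paper): one access port, and the cost of
    an access is the distance between the port and the target position.
    """
    port = start
    shifts = 0
    for name in accesses:
        pos = placement[name]       # tape position of the requested datum
        shifts += abs(pos - port)   # shifts needed to align it with the port
        port = pos                  # the port now rests at that position
    return shifts


# Hypothetical access order in which "a" and "c" alternate.
accesses = ["a", "c", "a", "c", "b"]
naive = {"a": 0, "b": 1, "c": 2}   # "a" and "c" placed far apart
tuned = {"a": 0, "c": 1, "b": 2}   # "a" and "c" placed next to each other

print(total_shifts(naive, accesses))  # 7
print(total_shifts(tuned, accesses))  # 4
```

Under this toy model, placing frequently alternating data next to each other lowers the shift count for the same access order.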
Pages: 10