A Dynamically Adaptive Approach for Speculative Loop Execution in SMT Architectures

Cited by: 2

Authors:

Li, Meirong [1]

Zhao, Yinliang [1]

Affiliations:

[1] Xi An Jiao Tong Univ, Dept Comp Sci & Technol, Xian, Peoples R China

Source:

2014 IEEE INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS, 2014 IEEE 6TH INTL SYMP ON CYBERSPACE SAFETY AND SECURITY, 2014 IEEE 11TH INTL CONF ON EMBEDDED SOFTWARE AND SYST (HPCC,CSS,ICESS) | 2014

Keywords:

Simultaneous multithreading; Performance prediction; Loop-level parallelism; Thread-level speculation;

DOI:

10.1109/HPCC.2014.171

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

Simultaneous multithreading (SMT) allows thread-level speculation to be exploited on a single processor. Because speculative threads contend for shared processor resources, their performance often suffers from inter-thread interference, which is hard for the compiler to estimate statically. We therefore propose an approach that defers the selection and extraction of speculative threads from parallel regions until runtime. It relies on a cycle counter architecture to collect performance profiles for each parallelized loop and to uncover the potential for loop-level parallelism. These profiles are derived from a prediction of the relative single-threaded execution time of speculative threads, based on a breakdown of thread execution cycles. The performance of different loop levels is evaluated dynamically using this prediction, and only the best loop level is chosen for parallelization. Several performance tuning policies are also examined. On SPEC CPU2000 benchmarks, the best policy achieves an average speedup of 1.45 and outperforms static loop selection by 33%.

Pages: 1024-1031

Page count: 8
