A Dynamically Adaptive Approach for Speculative Loop Execution in SMT Architectures

Times cited: 2
Authors
Li, Meirong [1]
Zhao, Yinliang [1]
Affiliations
[1] Xi An Jiao Tong Univ, Dept Comp Sci & Technol, Xian, Peoples R China
Source
2014 IEEE INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS, 2014 IEEE 6TH INTL SYMP ON CYBERSPACE SAFETY AND SECURITY, 2014 IEEE 11TH INTL CONF ON EMBEDDED SOFTWARE AND SYST (HPCC,CSS,ICESS) | 2014
Keywords
Simultaneous multithreading; Performance prediction; Loop-level parallelism; Thread-level speculation
DOI
10.1109/HPCC.2014.171
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Simultaneous multithreading (SMT) allows thread-level speculation to be exploited on a single processor. Because speculative threads contend for shared processor resources, their performance often suffers from inter-thread interference, which is hard for the compiler to estimate statically. We therefore propose an approach that defers the selection and extraction of speculative threads from parallel regions until runtime. It relies on a cycle counter architecture to collect performance profiles for each parallelized loop and to uncover the available loop-level parallelism. These profiles are derived by predicting the relative single-threaded execution time of speculative threads from a breakdown of their execution cycles. The performance of each loop level is evaluated dynamically using this prediction, and only the best-performing loop level is chosen for parallelization. Several performance tuning policies are also examined. The best policy achieves an average speedup of 1.45 on SPEC CPU2000 benchmarks and outperforms static loop selection by 33%.
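The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `LoopProfile`, `estimated_speedup`, and `pick_loop_level` are hypothetical, the cycle counts are made up, and the sketch assumes that each loop level in a nest already has a measured speculative cycle count and a predicted single-threaded cycle count covering the same execution.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LoopProfile:
    """Hypothetical runtime profile for one loop level of a loop nest."""
    name: str
    speculative_cycles: int       # cycles measured when run with speculative threads
    predicted_single_cycles: int  # predicted cycles for single-threaded execution


def estimated_speedup(profile: LoopProfile) -> float:
    # A ratio above 1.0 suggests that speculation pays off at this loop level.
    return profile.predicted_single_cycles / profile.speculative_cycles


def pick_loop_level(nest: List[LoopProfile]) -> Optional[LoopProfile]:
    """Return the loop level with the best estimated speedup, or None
    if no level is predicted to beat single-threaded execution."""
    if not nest:
        return None
    best = max(nest, key=estimated_speedup)
    return best if estimated_speedup(best) > 1.0 else None


# Illustrative numbers only; they are not taken from the paper.
nest = [
    LoopProfile("outer", speculative_cycles=900, predicted_single_cycles=1000),
    LoopProfile("inner", speculative_cycles=700, predicted_single_cycles=1000),
]
chosen = pick_loop_level(nest)
print(chosen.name if chosen else "run sequentially")
```

Falling back to sequential execution when no level reaches a ratio above 1.0 is a design choice of this sketch; the tuning policies examined in the paper may differ.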
Pages: 1024-1031
Number of pages: 8