Compiler and hardware support for reducing the synchronization of speculative threads

Cited by: 13
Authors
Zhai, Antonia [1]
Steffan, J. Gregory [2]
Colohan, Christopher B. [3]
Mowry, Todd C. [4]
Affiliations
[1] Univ Minnesota, Dept Comp Sci & Engn, Minneapolis, MN 55455 USA
[2] Univ Toronto, Toronto, ON, Canada
[3] Google, Ann Arbor, MI 48104 USA
[4] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
Keywords
design; experimentation; performance; thread-level speculation; chip-multiprocessing; automatic parallelization; instruction scheduling
DOI
10.1145/1369396.1369399
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Thread-level speculation (TLS) allows us to automatically parallelize general-purpose programs by supporting parallel execution of threads that might not actually be independent. In this article, we focus on one important limitation of program performance under TLS: stalls that result from synchronizing and forwarding scalar values between speculative threads, values that would otherwise cause frequent data dependences and, hence, failed speculation. Using SPECint benchmarks that have been automatically transformed by our compiler to exploit TLS, we present, evaluate in detail, and compare both compiler and hardware techniques for improving the communication of scalar values. We find that through our dataflow algorithms for three increasingly aggressive instruction scheduling techniques, the compiler can drastically reduce the critical forwarding path introduced by the synchronization and forwarding of scalar values. We also show that hardware techniques for reducing synchronization can be complementary to compiler scheduling, but that the additional performance benefits are minimal and are generally not worth the cost.
Pages: 1-33
Page count: 33