An Integrated Hardware-Software Approach to Flexible Transactional Memory

Cited by: 0
Authors
Shriraman, Arrvindh [1]
Spear, Michael F. [1]
Hossain, Hemayet [1]
Marathe, Virendra J. [1]
Dwarkadas, Sandhya [1]
Scott, Michael L. [1]
Affiliation
[1] Univ Rochester, Dept Comp Sci, Rochester, NY 14627 USA
Source
ISCA'07: 34TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE, CONFERENCE PROCEEDINGS | 2007
Keywords
Transactional memory; Cache coherence; Multiprocessors; RSTM;
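DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract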
There has been considerable recent interest in both hardware and software transactional memory (TM). We present an intermediate approach, in which hardware serves to accelerate a TM implementation controlled fundamentally by software. Specifically, we describe an alert on update mechanism (AOU) that allows a thread to receive fast, asynchronous notification when previously-identified lines are written by other threads, and a programmable data isolation mechanism (PDI) that allows a thread to hide its speculative writes from other threads, ignoring conflicts, until software decides to make them visible. These mechanisms reduce bookkeeping, validation, and copying overheads without constraining software policy on a host of design decisions. We have used AOU and PDI to implement a hardware-accelerated software transactional memory system we call RTM. We have also used AOU alone to create a simpler "RTM-Lite". Across a range of microbenchmarks, RTM outperforms RSTM, a publicly available software transactional memory system, by as much as 8.7X (geometric mean of 3.5X) in single-thread mode. At 16 threads, it outperforms RSTM by as much as 5X, with an average speedup of 2X. Performance degrades gracefully when transactions overflow hardware structures. RTM-Lite is slightly faster than RTM for transactions that modify only small objects; full RTM is significantly faster when objects are large. In a strong argument for policy flexibility, we find that the choice between eager (first-access) and lazy (commit-time) conflict detection can lead to significant performance differences in both directions, depending on application characteristics.
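The alert-on-update idea in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware design from the paper: the class `AlertOnUpdateModel`, its methods `mark_and_load` and `store`, the one-shot behavior of marks, and the thread names are all hypothetical and chosen for illustration. The model marks a line on behalf of a thread and calls that thread's handler when a different thread writes the line.

```python
from typing import Callable, Dict, List, Set, Tuple


class AlertOnUpdateModel:
    """Toy single-process model of alert-on-update (hypothetical, illustrative only)."""

    def __init__(self) -> None:
        self.memory: Dict[int, int] = {}         # line address -> value
        self.watchers: Dict[int, Set[str]] = {}  # line address -> ids of marking threads
        self.handlers: Dict[str, Callable[[int], None]] = {}

    def register_handler(self, thread_id: str, handler: Callable[[int], None]) -> None:
        """Install the routine that runs when thread_id is alerted."""
        self.handlers[thread_id] = handler

    def mark_and_load(self, thread_id: str, line: int) -> int:
        """Read a line and mark it so a later write by another thread alerts thread_id."""
        self.watchers.setdefault(line, set()).add(thread_id)
        return self.memory.get(line, 0)

    def store(self, thread_id: str, line: int, value: int) -> None:
        """Write a line, then alert every *other* thread that marked it."""
        self.memory[line] = value
        alerted = self.watchers.get(line, set()) - {thread_id}
        for watcher in sorted(alerted):
            # Assumption: a mark is one-shot and is cleared once the alert fires.
            self.watchers[line].discard(watcher)
            handler = self.handlers.get(watcher)
            if handler is not None:
                handler(line)


# Usage: thread "T1" marks line 0x40; only a write by another thread alerts it.
alerts: List[Tuple[str, int]] = []
model = AlertOnUpdateModel()
model.register_handler("T1", lambda line: alerts.append(("T1", line)))
model.mark_and_load("T1", 0x40)
model.store("T1", 0x40, 1)  # own write: no alert
model.store("T2", 0x40, 2)  # remote write: T1's handler runs
```

In a software TM built on such a mechanism, the handler would typically be where a transaction decides to validate or abort; that policy stays in software, which is the flexibility the abstract argues for.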
Pages: 104-115
Page count: 12