Analysing software prefetching opportunities in hardware transactional memory

Authors
Marina Shimchenko
Rubén Titos-Gil
Ricardo Fernández-Pascual
Manuel E. Acacio
Stefanos Kaxiras
Alberto Ros
Alexandra Jimborean
Affiliations
[1] Uppsala University, Department of Computing Systems
[2] University of Murcia, Computer Engineering Department
Source
The Journal of Supercomputing | 2022 / Vol. 78
Keywords
Hardware transactional memory; Parallel programming; Compiler; Software prefetching
Abstract
Hardware transactional memory emerged to make parallel programming more accessible. Its main performance pitfall is that, on an abort, speculatively executed instructions are squashed and re-executed, and repeated conflicts ultimately force serialization. A significant fraction of aborts is caused by conflicts, that is, concurrent accesses by different threads to the same memory location where at least one access is a write. Our proposal aims to reduce conflict aborts by shortening the window of time during which transactional regions can suffer conflicts. We achieve this with software prefetch instructions inserted automatically at compile time. These prefetches bring the data each transaction needs from main memory into the cache before the transaction starts to execute, converting otherwise long-latency cache misses into hits during the transaction. The results show that our approach decreases the number of aborts by 30% on average and improves performance by up to 19% and 10% in two of the eight evaluated benchmarks. We provide insights into when the technique is beneficial given certain characteristics of the transactional regions, weigh the advantages and disadvantages of the approach, and discuss potential solutions to some of its limitations.
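The core idea of the abstract, touching a transaction's data before the transaction begins, can be illustrated with a small hand-written C sketch. This is not the authors' compiler pass, which this record does not show. The names `counters`, `tx_begin`, `tx_end`, and `update_counters` are hypothetical, and a simple spin lock stands in for a hardware transaction so that the example runs on any machine. The only real facility used is the GCC/Clang builtin `__builtin_prefetch`.

```c
#include <stddef.h>
#include <stdatomic.h>

/* Hypothetical shared data updated inside the critical region. */
#define NUM_COUNTERS 1024
static long counters[NUM_COUNTERS];

/* ASSUMPTION: a spin lock stands in for a hardware transaction.
 * On real HTM hardware these would be the transaction begin/end
 * primitives, with a lock only as the fallback path. */
static atomic_flag region_lock = ATOMIC_FLAG_INIT;

static void tx_begin(void) {
    while (atomic_flag_test_and_set_explicit(&region_lock,
                                             memory_order_acquire)) {
        /* spin until the region is free */
    }
}

static void tx_end(void) {
    atomic_flag_clear_explicit(&region_lock, memory_order_release);
}

/* Adds delta to each counter named in idx[0..n-1]; returns the sum of
 * the values observed right after each update. */
long update_counters(const size_t *idx, size_t n, long delta) {
    /* Prefetch phase, OUTSIDE the region: request each cache line for
     * writing (second argument 1) with high temporal locality (third
     * argument 3). A prefetch is only a hint and changes no values. */
    for (size_t i = 0; i < n; i++) {
        __builtin_prefetch(&counters[idx[i]], 1, 3);
    }

    long sum = 0;
    tx_begin();
    /* Region body: ideally these accesses now hit in the cache, so the
     * region is shorter and exposed to conflicts for less time. */
    for (size_t i = 0; i < n; i++) {
        counters[idx[i]] += delta;
        sum += counters[idx[i]];
    }
    tx_end();
    return sum;
}
```

In this sketch the addresses are known before the region starts, which is the easy case. When the addresses depend on values computed inside the region, the prefetch phase has to recompute them ahead of time, which is where automatic compile-time insertion becomes useful.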
Pages: 919-944
Page count: 25