Analysing software prefetching opportunities in hardware transactional memory

Cited by: 0
Authors
Shimchenko, Marina [1]
Titos-Gil, Ruben [2]
Fernandez-Pascual, Ricardo [2]
Acacio, Manuel E. [2]
Kaxiras, Stefanos [1]
Ros, Alberto [2]
Jimborean, Alexandra [1, 2]
Affiliations
[1] Uppsala Univ, Dept Comp Syst, Uppsala, Sweden
[2] Univ Murcia, Comp Engn Dept, Murcia, Spain
Funding
Swedish Research Council; European Research Council
Keywords
Hardware transactional memory; Parallel programming; Compiler; Software prefetching;
DOI
10.1007/s11227-021-03897-z
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Hardware transactional memory emerged to make parallel programming more accessible. However, the performance pitfall of this technique is squashing speculatively executed instructions and re-executing them in case of aborts, ultimately resorting to serialization in case of repeated conflicts. A significant fraction of aborts occurs due to conflicts (concurrent reads and writes to the same memory location performed by different threads). Our proposal aims to reduce conflict aborts by reducing the window of time during which transactional regions can suffer conflicts. We achieve this by using software prefetching instructions inserted automatically at compile-time. Through these prefetch instructions, we intend to bring the necessary data for each transaction from the main memory to the cache before the transaction itself starts to execute, thus converting the otherwise long latency cache misses into hits during the execution of the transaction. The obtained results show that our approach decreases the number of aborts by 30% on average and improves performance by up to 19% and 10% for two out of the eight evaluated benchmarks. We provide insights into when our technique is beneficial given certain characteristics of the transactional regions, the advantages and disadvantages of our approach, and finally, discuss potential solutions to overcome some of its limitations.
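The core idea in the abstract, issuing prefetches for a transaction's data before the transaction begins, can be illustrated with a minimal hand-written sketch. This is not the authors' compiler pass: the `account_t` type, the `transfer` function, and the `TX_BEGIN`/`TX_END` macros are hypothetical names introduced only for illustration, and the macros are deliberately empty placeholders so the example runs on any machine. `__builtin_prefetch` is the GCC/Clang prefetch builtin; whether write-intent prefetching is the right choice for a given workload is an assumption here.

```c
#include <stddef.h>

/* Hypothetical shared data touched inside the transactional region. */
typedef struct {
    long balance;
} account_t;

/* Placeholders for the platform's transaction begin/commit primitives.
 * Assumption: a real build would map these to the target's HTM
 * instructions plus a fallback path; here they expand to nothing. */
#define TX_BEGIN() ((void)0)
#define TX_END()   ((void)0)

static void transfer(account_t *from, account_t *to, long amount)
{
    /* Prefetch both cache lines BEFORE the transaction starts, so the
     * accesses inside the region are more likely to hit in the cache.
     * Arguments: address, rw = 1 (expect a write), locality = 3 (keep). */
    __builtin_prefetch(from, 1, 3);
    __builtin_prefetch(to, 1, 3);

    TX_BEGIN();
    /* The region itself stays short: no long-latency misses expected. */
    from->balance -= amount;
    to->balance += amount;
    TX_END();
}
```

In this sketch the prefetches sit outside the region on purpose: the aim described in the abstract is to shorten the window during which the transaction can suffer a conflict, not to change what the transaction computes.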
Pages: 919-944
Number of pages: 26