An access pattern based adaptive mapping function for GPGPU scratchpad memory

Cited by: 0
Authors
Han, Feng [1]
Li, Li [1]
Wang, Kun [1]
Feng, Fan [1]
Pan, Hongbing [1]
Sha, Jin [1]
Lin, Jun [1]
Affiliations
[1] Nanjing Univ, Sch Elect Sci & Engn, Nanjing 210023, Jiangsu, Peoples R China
Funding
Specialized Research Fund for the Doctoral Program of Higher Education;
Keywords
GPGPU; scratchpad memory; adaptive mapping function; bank conflict reduction;
DOI
10.1587/elex.14.20170373
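Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808; 0809;
Abstract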
As modern GPUs integrate massive numbers of processing elements but only limited on-chip memory, the efficiency with which their scratchpad memories are used becomes important for both performance and energy. To meet the bandwidth requirement of a thread array accessing memory simultaneously, a multi-bank design, which divides a scratchpad memory into equally sized memory modules, is widely used. However, the complex access patterns of real-world applications can cause bank conflicts, which arise when different threads access the same bank at the same time, and these conflicts degrade performance sharply. A mapping function redistributes accesses across banks according to their addresses. To reduce bank conflicts, several scratchpad memory mapping functions have been explored, such as XOR-based hash functions and configurable functions. In this paper, we propose an adaptive mapping function that dynamically selects a suitable mapping function for each application based on statistics gathered while the first block executes. Experimental results on GPGPU-sim, a Fermi-like simulator, show that bank conflicts are reduced by 94.8 percent and performance is improved by 1.235x across 17 benchmarks.
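The ideas in the abstract (bank conflicts, address-to-bank mapping functions, and selection from first-block statistics) can be illustrated with a small sketch. It is not the paper's implementation: the bank count of 32, the word-granular addressing, the broadcast of same-word accesses, the particular XOR fold in `xor_map`, and the "fewest total conflict cycles" selection rule in `select_mapping` are all assumptions chosen for illustration, and every function name here is hypothetical.

```python
from collections import Counter

NUM_BANKS = 32  # assumption: 32 banks, one warp of 32 threads


def modulo_map(word_addr):
    """Baseline mapping: the low address bits pick the bank."""
    return word_addr % NUM_BANKS


def xor_map(word_addr):
    """Illustrative XOR hash: fold the next-higher address bits into the bank index."""
    return (word_addr ^ (word_addr // NUM_BANKS)) % NUM_BANKS


def conflict_cycles(word_addrs, mapping):
    """Extra serialized cycles for one simultaneous access by a group of threads.

    Assumption: threads reading the same word are served by one broadcast,
    so only distinct words that land in the same bank conflict.
    """
    per_bank = Counter(mapping(a) for a in set(word_addrs))
    if not per_bank:
        return 0
    return max(per_bank.values()) - 1


def select_mapping(first_block_trace, candidates):
    """Pick the candidate with the fewest conflict cycles over a recorded trace.

    first_block_trace: list of per-access address lists (hypothetical format).
    candidates: dict mapping a name to a mapping function.
    """
    def total(name):
        return sum(conflict_cycles(acc, candidates[name]) for acc in first_block_trace)
    return min(candidates, key=total)


CANDIDATES = {"modulo": modulo_map, "xor": xor_map}

# Column walk through a 32x32 word matrix: thread t touches word 32*t.
column = [32 * t for t in range(32)]
# Same walk through a padded 32x33 matrix: thread t touches word 33*t.
padded = [33 * t for t in range(32)]

choice_for_column = select_mapping([column], CANDIDATES)
choice_for_padded = select_mapping([padded], CANDIDATES)
```

In this sketch the unpadded column walk is fully serialized under `modulo_map` but conflict-free under `xor_map`, while the padded walk behaves the opposite way, which is the kind of pattern dependence that motivates selecting the mapping per application rather than fixing one in advance.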
Pages: 12