OpenMP extensions for FPGA Accelerators

Cited by: 22
Authors
Cabrera, Daniel [1, 2]
Martorell, Xavier [1, 2]
Gaydadjiev, Georgi [3]
Ayguade, Eduard [1, 2]
Jimenez-Gonzalez, Daniel [1, 2]
Affiliations
[1] Barcelona Supercomp Ctr, C Jordi Girona 31, E-08034 Barcelona, Spain
[2] Univ Politecn Cataluna, E-08034 Barcelona, Spain
[3] Delft Univ Technol, NL-2628 CD Delft, Netherlands
Source
2009 INTERNATIONAL CONFERENCE ON EMBEDDED COMPUTER SYSTEMS: ARCHITECTURES, MODELING AND SIMULATION, PROCEEDINGS | 2009
DOI
10.1109/ICSAMOS.2009.5289237
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Subject classification code
0812;
Abstract
Reconfigurable computing is one of the paths to explore towards low-power supercomputing. However, programming reconfigurable devices is not an easy task and still requires significant research and development effort to become productive. In addition, using these devices as accelerators in multicore, SMP and ccNUMA architectures adds a further level of programming complexity: the programmer must specify the offloading of tasks to the reconfigurable devices and ensure interoperability with current shared-memory programming paradigms such as OpenMP. This paper presents extensions to OpenMP 3.0 that address this second challenge, together with an implementation in a prototype runtime system. With these extensions the programmer can express the offloading of an existing reconfigurable binary code (bitstream) while hiding all the complexities related to device configuration, bitstream loading, and data arrangement and movement to the device memory. Our current prototype implementation targets SGI Altix systems with RASC blades (based on the Virtex 4 FPGA). We analyze the overheads introduced by this implementation and propose a hybrid host/device operational mode that hides some of these overheads, significantly improving application performance. A complete evaluation of the system is done with a matrix multiplication kernel, including an estimation that considers different FPGA frequencies.
Pages: 17 / +
Page count: 2