Enhancing Design Space Exploration by Extending CPU/GPU Specifications onto FPGAs

Cited by: 7
Authors
Owaida, Muhsen [1]
Falcao, Gabriel [2]
Andrade, Joao [2]
Antonopoulos, Christos [3]
Bellas, Nikolaos [3]
Purnaprajna, Madhura [1]
Novo, David [1]
Karakonstantis, Georgios [4]
Burg, Andreas [5]
Ienne, Paolo [1]
Affiliations
[1] Ecole Polytech Fed Lausanne, Architecture Lab LAP, Sch Comp & Commun Sci, CH-1015 Lausanne, Switzerland
[2] Univ Coimbra, Fac Sci & Technol, Inst Telecomunicacoes, Dept Elect & Comp Engn, P-3030290 Coimbra, Portugal
[3] Univ Thessaly, Dept Elect & Comp Engn, Thessaly, Greece
[4] Queens Univ Belfast, Elect Engn & Comp Sci, Sch Elect, Belfast BT7 1NN, Antrim, North Ireland
[5] Ecole Polytech Fed Lausanne, Telecommun Circuit Lab, CH-1015 Lausanne, Switzerland
Keywords
Design; Algorithms; Performance; Design space exploration; simulation tools; parallel computing; FPGAs; GPUs; OpenCL; LDPC
DOI
10.1145/2656207
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The design cycle for complex special-purpose computing systems is extremely costly and time-consuming. It involves a multiparametric design space exploration for optimization, followed by design verification. Designers of special-purpose VLSI implementations often need to explore parameters, such as optimal bitwidth and data representation, through time-consuming Monte Carlo simulations. A prominent example of this simulation-based exploration process is the design of decoders for error-correcting systems, such as the Low-Density Parity-Check (LDPC) codes adopted by modern communication standards, which involves thousands of Monte Carlo runs for each design point. Currently, high-performance computing offers a wide set of acceleration options that range from multicore CPUs to Graphics Processing Units (GPUs) and Field Programmable Gate Arrays (FPGAs). Exploiting diverse target architectures typically requires developing multiple code versions, often using distinct programming paradigms. In this context, we evaluate the concept of retargeting a single OpenCL program to multiple platforms, thereby significantly reducing design time. A single OpenCL-based parallel kernel is used without modifications or code tuning on multicore CPUs, GPUs, and FPGAs. We use SOpenCL (Silicon to OpenCL), a tool that automatically converts OpenCL kernels to RTL, to introduce FPGAs as a potential platform for efficiently executing simulations coded in OpenCL. We use LDPC decoding simulations as a case study. Experimental results were obtained by testing a variety of regular and irregular LDPC codes that range from short/medium (e.g., 8,000-bit) to long (e.g., 64,800-bit) DVB-S2 codes. We observe that, depending on the design parameters to be simulated and on the dimension and phase of the design, either the GPU or the FPGA may be the more suitable platform, each providing different acceleration factors over conventional multicore CPUs.
Pages: 23