FPGA-Based Computational Fluid Dynamics Simulation Architecture via High-Level Synthesis Design Method

Cited by: 5
Authors
Du, Changdao [1]
Firmansyah, Iman [1]
Yamaguchi, Yoshiki [1]
Affiliation
[1] Univ Tsukuba, Tsukuba, Ibaraki 3058577, Japan
Source
APPLIED RECONFIGURABLE COMPUTING. ARCHITECTURES, TOOLS, AND APPLICATIONS, ARC 2020 | 2020 / Vol. 12083
Keywords
HPC; FPGA; HLS; CFD; LATTICE BOLTZMANN METHOD; STENCIL COMPUTATION; IMPLEMENTATION;
DOI
10.1007/978-3-030-44534-8_18
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Today's High-Performance Computing (HPC) systems often use GPUs as dedicated hardware accelerators to meet the computational requirements of applications such as neural networks, genetic decoding, and hydrodynamic simulations. Meanwhile, FPGAs have also been considered as suitable alternative accelerators because of their advancing computational capabilities and low power consumption. Moreover, advances in High-Level Synthesis (HLS) allow users to generate FPGA designs directly from mainstream languages, e.g., C, C++, and OpenCL. However, writing high-level programs that perform well is still a time-consuming task, and a lack of knowledge about FPGA architecture can lead to poor scalability and portability. In this paper, we propose an architecture design for Computational Fluid Dynamics (CFD) simulations based on the HLS method. Our design can adjust its performance by exploiting parallelism in both the temporal and spatial domains of CFD simulations. We also discuss the optimization choices for the data reuse buffer while considering the portability of HLS code. A performance model is introduced to guide design space exploration under the constraints of the resources available on the FPGA. We evaluate our design on a Xilinx VCU1525 FPGA board and compare the results with other state-of-the-art studies. Experimental results show that the VCU1525 can achieve 629.6 GFLOP/s on the D2Q9 LBM-BGK model, and that the design and optimization methods can be used for developing various CFD applications.
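To make the D2Q9 LBM-BGK model named in the abstract more concrete, the following is a minimal sketch of the per-cell collision step in plain C++. It is not the paper's FPGA kernel: the names `Cell`, `equilibrium`, and `collide_bgk` are hypothetical, the lattice weights and velocity ordering are the textbook D2Q9 choices, and no HLS pragmas, streaming step, or reuse buffer are shown.

```cpp
#include <array>
#include <cassert>

// Textbook D2Q9 lattice: 9 discrete velocities and their weights.
// (Assumed ordering: rest, 4 axis directions, 4 diagonals.)
constexpr int Q = 9;
constexpr int CX[Q] = {0, 1, 0, -1, 0, 1, -1, -1, 1};
constexpr int CY[Q] = {0, 0, 1, 0, -1, 1, 1, -1, -1};
constexpr float W[Q] = {4.0f / 9,  1.0f / 9,  1.0f / 9,
                        1.0f / 9,  1.0f / 9,  1.0f / 36,
                        1.0f / 36, 1.0f / 36, 1.0f / 36};

// Hypothetical type: the 9 particle populations stored for one lattice cell.
using Cell = std::array<float, Q>;

// Second-order equilibrium distribution for density rho and velocity (ux, uy).
Cell equilibrium(float rho, float ux, float uy) {
    Cell feq{};
    const float usq = ux * ux + uy * uy;
    for (int i = 0; i < Q; ++i) {
        const float cu = CX[i] * ux + CY[i] * uy;
        feq[i] = W[i] * rho * (1.0f + 3.0f * cu + 4.5f * cu * cu - 1.5f * usq);
    }
    return feq;
}

// One BGK collision: relax every population toward equilibrium with
// relaxation time tau. Density and momentum of the cell are conserved.
Cell collide_bgk(const Cell& f, float tau) {
    assert(tau > 0.5f);  // tau <= 0.5 would give a non-positive viscosity
    float rho = 0.0f, mx = 0.0f, my = 0.0f;
    for (int i = 0; i < Q; ++i) {
        rho += f[i];
        mx += CX[i] * f[i];
        my += CY[i] * f[i];
    }
    const Cell feq = equilibrium(rho, mx / rho, my / rho);
    Cell out{};
    for (int i = 0; i < Q; ++i) {
        out[i] = f[i] - (f[i] - feq[i]) / tau;
    }
    return out;
}
```

Because each cell's collision depends only on its own nine values, the loop bodies are natural candidates for unrolling (spatial parallelism), while chaining several time steps back to back would correspond to the temporal parallelism the abstract mentions. How the paper actually maps these to hardware is not shown here.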
Pages: 232-246
Number of pages: 15