Generating High-Performance FPGA Accelerator Designs for Big Data Analytics with Fletcher and Apache Arrow

Cited by: 1
Authors
Peltenburg, Johan [1]
van Straten, Jeroen [1]
Brobbel, Matthijs [1]
Al-Ars, Zaid [1]
Hofstee, H. Peter [1, 2]
Affiliations
[1] Delft Univ Technol, Delft, Netherlands
[2] IBM Corp, Austin, TX USA
Source
JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY | 2021 / Vol. 93 / Issue 05
Keywords
FPGA; Accelerator; Big data; Analytics; Fletcher; Apache Arrow;
DOI
10.1007/s11265-021-01650-6
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Subject classification code
0812 ;
Abstract
As big data analytics systems squeeze the last bits of performance out of CPUs and GPUs, the FPGA accelerator is the next near-term, widely available alternative that industry is considering for higher performance in the data center and cloud. We discuss several challenges a developer faces when designing and integrating FPGA accelerators for big data analytics pipelines. On the software side, we observe complex run-time systems, hardware-unfriendly in-memory layouts of data sets, and (de)serialization overhead. On the hardware side, we observe a relative lack of platform-agnostic open-source tooling, a high design effort for data structure-specific interfaces, and a high design effort for infrastructure. The open-source Fletcher framework addresses these challenges. It is built on top of Apache Arrow, which provides a common, hardware-friendly in-memory format that allows zero-copy communication of large tabular data, preventing (de)serialization overhead. Fletcher adds FPGA accelerators to the list of over eleven software languages that Arrow supports. To deal with the hardware challenges, we present Arrow-specific components that provide easy-to-use, high-performance interfaces to accelerated kernels. The components are combined based on a generic architecture that is specialized to the application through an extensive infrastructure generation framework, which is presented in this article. All generated hardware is vendor-agnostic, and software drivers add a platform-agnostic layer, allowing users to create portable implementations.
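The "hardware-friendly in-memory format" mentioned in the abstract can be illustrated with a small sketch. In Arrow's columnar layout, a variable-length string column is stored as a contiguous data buffer plus a buffer of 32-bit offsets, so element i can be located with two offset reads and no per-element parsing. The code below is a simplified, standard-library-only illustration of that idea; it is not Fletcher's or Arrow's API, the function names `encode_utf8_column` and `read_utf8_value` are hypothetical, and the validity bitmap and buffer padding of the real format are omitted.

```python
import struct


def encode_utf8_column(values):
    """Pack strings into an offsets buffer and a contiguous data buffer.

    Simplified sketch: no validity bitmap (nulls) and no padding.
    """
    offsets = [0]
    data = bytearray()
    for value in values:
        data += value.encode("utf-8")
        offsets.append(len(data))  # end of this value = start of the next
    # Offsets are packed as little-endian signed 32-bit integers.
    offsets_buf = struct.pack(f"<{len(offsets)}i", *offsets)
    return offsets_buf, bytes(data)


def read_utf8_value(offsets_buf, data_buf, i):
    """Read element i using only two offset lookups and one slice."""
    start, end = struct.unpack_from("<2i", offsets_buf, 4 * i)
    return data_buf[start:end].decode("utf-8")


offsets_buf, data_buf = encode_utf8_column(["FPGA", "", "Arrow"])
third = read_utf8_value(offsets_buf, data_buf, 2)
```

Because both buffers are flat byte arrays with fixed-width offsets, the same bytes could in principle be handed to another consumer without converting them into a different representation first, which is the property the abstract refers to as avoiding (de)serialization overhead.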
Pages: 565-586
Number of pages: 22