Data-driven HLS optimization for reconfigurable accelerators

Cited by: 0
Authors
Ferikoglou, Aggelos [1]
Kakolyris, Andreas [1]
Kypriotis, Vasilis [1]
Masouros, Dimosthenis [1]
Soudris, Dimitrios [1]
Xydis, Sotirios [1]
Affiliations
[1] Natl Tech Univ Athens, Athens, Greece
Source
PROCEEDINGS OF THE 61ST ACM/IEEE DESIGN AUTOMATION CONFERENCE, DAC 2024 | 2024
Keywords
High-Level Synthesis (HLS); Design Space Exploration (DSE); FPGA Accelerators; Auto-tuning; Data-driven Optimization; DESIGN;
DOI
10.1145/3649329.3658471
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
High-Level Synthesis (HLS) has played a pivotal role in making FPGAs accessible to a broader audience by enabling high-level device programming and rapid microarchitecture customization through directives. However, manually selecting the right directives can be a formidable challenge for programmers without a hardware background. This paper introduces an ultra-fast, knowledge-based HLS design optimization method that automatically extracts the most promising directive configurations and applies them to the original source code. The approach is entirely data-driven and offers a generalized HLS tuning solution that does not rely on Quality of Result (QoR) models or meta-heuristics. We design, implement, and evaluate the methodology on over 100 applications drawn from well-established benchmark suites and GitHub repositories, all running on a Xilinx ZCU104 FPGA. The method achieves geometric mean speedups of 7.2x over designer-optimized designs and 1.35x over resource over-provisioning strategies. It also attains a high design feasibility score and an average inference latency of 38 ms. Compared with traditional genetic algorithm-based Design Space Exploration (DSE) methods and State-of-the-Art (SoA) approaches, it produces designs of similar quality 2-3 orders of magnitude faster. These results suggest that the method is a highly promising solution for ultra-fast, automated HLS optimization.
Pages: 6