ScaleHLS: A New Scalable High-Level Synthesis Framework on Multi-Level Intermediate Representation

Cited by: 45
Authors
Ye, Hanchen [1 ]
Hao, Cong [2 ]
Cheng, Jianyi [3 ]
Jeong, Hyunmin [1 ]
Huang, Jack [1 ]
Neuendorffer, Stephen [4 ]
Chen, Deming [1 ]
Affiliations
[1] Univ Illinois, Urbana, IL 61801 USA
[2] Georgia Inst Technol, Atlanta, GA 30332 USA
[3] Imperial Coll London, London, England
[4] Xilinx Inc, San Jose, CA USA
Source
2022 IEEE INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE COMPUTER ARCHITECTURE (HPCA 2022) | 2022
Keywords
High-Level Synthesis; MLIR; Compiler; FPGA; Optimization; Design Space Exploration;
DOI
10.1109/HPCA53966.2022.00060
Chinese Library Classification
TP3 [Computing Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
High-level synthesis (HLS) has been widely adopted as it significantly improves hardware design productivity and enables efficient design space exploration (DSE). Existing HLS tools are built on compiler infrastructures largely based on a single level of abstraction, such as LLVM. However, as HLS designs typically come with intrinsic structural or functional hierarchies, different HLS optimization problems are often better solved at different levels of abstraction. This paper proposes ScaleHLS, a new scalable and customizable HLS framework, on top of a multi-level compiler infrastructure called MLIR. ScaleHLS represents HLS designs at multiple representation levels and provides an HLS-dedicated analysis and transform library to solve each optimization problem at its suitable level. Using this library, we provide a DSE engine that generates optimized HLS designs automatically. In addition, we develop an HLS C front-end and a C/C++ emission back-end to translate HLS designs into and out of MLIR, enabling an end-to-end compilation flow. Experimental results show that, compared to baseline designs optimized only by Xilinx Vivado HLS without manual directive insertion or code rewriting, ScaleHLS achieves remarkable quality-of-results improvements - up to 768.1x better on computation-kernel-level programs and up to 3825.0x better on neural network models.
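To make the DSE idea concrete, the sketch below models the kind of search such an engine performs: enumerating HLS directive combinations (here, loop unroll factors and pipeline initiation intervals) and picking the lowest-latency point that fits a resource budget. This is a hypothetical toy illustration, not the actual ScaleHLS engine; the cost model, constants, and function names are all assumptions.

```python
# Toy design-space exploration over HLS loop directives.
# Illustrative only: the analytical latency/resource model below is a
# made-up stand-in for the estimators a real DSE engine would use.
from itertools import product

TRIP_COUNT = 1024   # loop iterations in the kernel being optimized
DSP_BUDGET = 64     # hypothetical FPGA resource constraint

def estimate(unroll, ii):
    """Toy model: unrolling divides the iteration count, pipelining
    overlaps iterations at the given initiation interval (II)."""
    iters = -(-TRIP_COUNT // unroll)      # ceiling division
    latency = iters * ii + unroll         # + unroll: rough pipeline drain cost
    dsps = unroll * 2                     # assume 2 DSPs per unrolled copy
    return latency, dsps

def explore():
    """Exhaustively search the (unroll, II) space under the DSP budget."""
    best = None
    for unroll, ii in product([1, 2, 4, 8, 16, 32], [1, 2, 4]):
        latency, dsps = estimate(unroll, ii)
        if dsps <= DSP_BUDGET and (best is None or latency < best[0]):
            best = (latency, unroll, ii)
    return best

if __name__ == "__main__":
    latency, unroll, ii = explore()
    print(f"best point: unroll={unroll}, II={ii}, estimated latency={latency}")
```

A real engine (as the abstract describes) operates on MLIR representations of the design and applies verified transform passes rather than a closed-form model, but the search structure - enumerate, estimate, prune by constraints, keep the best - is the same.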
Pages: 741-755 (15 pages)