Array Partitioning Method for Streaming Dataflow Optimization in High-level Synthesis

Cited by: 0

Authors:

Hou, Renjing [1]

Zhai, Jianwang [1]

Wang, Yajun [2]

Lin, Zhe [3]

Zhao, Kang [1]

Affiliations:

[1] Beijing Univ Posts & Telecommun, Sch Integrated Circuits, Beijing, Peoples R China

[2] Tiangong Univ, Sch Elect & Informat Engn, Tianjin, Peoples R China

[3] Sun Yat Sen Univ, Sch Integrated Circuits, Guangzhou, Guangdong, Peoples R China

Source:

2024 INTERNATIONAL SYMPOSIUM OF ELECTRONICS DESIGN AUTOMATION, ISEDA 2024 | 2024

Funding:

National Key Research and Development Program of China; Beijing Natural Science Foundation;

Keywords:

High-level Synthesis; Streaming Dataflow; Array Partitioning; FPGA;

DOI:

10.1109/ISEDA62518.2024.10618042

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

High-level synthesis (HLS) is a popular method that lets designers describe functionality at the behavioral level and automatically generates efficient register-transfer level (RTL) descriptions. In HLS, dataflow is the key micro-architecture for achieving high parallelism. However, strict conditions, such as the requirement of sequential access on the potential channels, often limit streaming dataflow. To address this issue, this paper proposes an efficient array partitioning method for streaming dataflow inference. The key idea is to explore array partitioning modes that match the sequential access requirements of the streaming channels. An experimental case study on convolutional neural network (CNN) inference indicates that the proposed method achieves a performance improvement of about 28.6% over the default dataflow, at the cost of a 7.2% increase in power.

Citation

Pages: 278-282

Page count: 5

