CuFP: An HLS Library for Customized Floating-Point Operators

Cited by: 1

Authors:

Hajizadeh, Fahimeh [1]

Ould-Bachir, Tarek [2]

David, Jean Pierre [1]

Affiliations:

[1] Polytech Montreal, Dept Elect Engn, Montreal, PQ H3T 1J4, Canada

[2] Polytech Montreal, MOTCE Lab, Dept Comp Engn, Montreal, PQ H3T 1J4, Canada

Source:

ELECTRONICS | 2024 / Vol. 13 / No. 14

Funding:

Natural Sciences and Engineering Research Council of Canada (NSERC);

Keywords:

floating-point; high-level synthesis (HLS); FPGA; custom precision; customized floating-point; custom operation; vector summation (VSUM); dot-product (DP); matrix-vector multiplication (MVM);

DOI:

10.3390/electronics13142838

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

High-Level Synthesis (HLS) tools have revolutionized FPGA application development by providing a more efficient and streamlined approach, significantly impacting digital design methodologies. Although FPGAs can customize numerical representations in data paths, most HLS projects have focused on fixed-point precision, while floating-point representations remain limited to the vendor-provided half-, single-, and double-precision formats. This paper proposes a customized floating-point library, CuFP, that is compatible with HLS and addresses these limitations. The library allows programmers to define the number of exponent and mantissa bits at compile time, providing greater flexibility and enabling the use of mixed precision. It also includes optimized implementations of common components such as vector summation (VSUM), dot-product (DP), and matrix-vector multiplication (MVM). Results demonstrate that the proposed library reduces latency and resource utilization compared to vendor IP blocks, particularly in VSUM, DP, and MVM operations. For example, an MVM operation on a 32 × 32 matrix requires 22 clock cycles with vendor IP, whereas CuFP completes the same task in just 7 clock cycles, using approximately 60% fewer DSPs, 10% fewer LUTs, and 60% fewer FFs.
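To make the idea of compile-time exponent and mantissa widths concrete, the following is a minimal software sketch in C++. It is not the CuFP API, which this record does not show: the names `quantize` and `dot`, the template parameters `E` and `M`, and the simplifications (flush-to-zero below the smallest normal value, saturation to infinity, intermediate arithmetic carried out in `double`) are all assumptions made for illustration. The naive sequential loop in `dot` only models the numerical effect of reduced precision; it does not reflect how the library's optimized VSUM, DP, or MVM operators are built.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>

// Illustrative sketch only; NOT the CuFP API. All names are hypothetical.
// Rounds x to a format with E exponent bits and M stored mantissa bits
// (plus an implicit leading 1). Simplifications: results below the smallest
// normal number flush to zero, and overflow saturates to infinity.
template <int E, int M>
double quantize(double x) {
    static_assert(E >= 2 && E <= 10, "exponent width must fit inside a double");
    static_assert(M >= 1 && M <= 51, "mantissa width must fit inside a double");
    if (x == 0.0 || !std::isfinite(x)) return x;

    int exp = 0;
    const double frac = std::frexp(x, &exp);  // x = frac * 2^exp, 0.5 <= |frac| < 1
    // Keep M + 1 significant bits, rounding in the current rounding mode.
    const double kept = std::nearbyint(std::ldexp(frac, M + 1));
    const double y = std::ldexp(kept, exp - (M + 1));

    const int bias = (1 << (E - 1)) - 1;
    const double max_normal = std::ldexp(2.0 - std::ldexp(1.0, -M), bias);
    const double min_normal = std::ldexp(1.0, 1 - bias);
    if (std::fabs(y) > max_normal)
        return std::copysign(std::numeric_limits<double>::infinity(), y);
    if (std::fabs(y) < min_normal) return std::copysign(0.0, y);
    return y;
}

// Naive dot product that re-quantizes after every multiply and add.
// Intermediates are computed in double first, so this is a numerical model,
// not a bit-accurate emulation of any hardware operator.
template <int E, int M>
double dot(const double* a, const double* b, std::size_t n) {
    assert(a != nullptr && b != nullptr);
    double acc = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double prod =
            quantize<E, M>(quantize<E, M>(a[i]) * quantize<E, M>(b[i]));
        acc = quantize<E, M>(acc + prod);
    }
    return acc;
}
```

With `E = 8` and `M = 23` the sketch behaves like IEEE single precision for normal values, while a narrow choice such as `E = 5`, `M = 2` shows how quickly a small mantissa loses accuracy in an accumulation. Choosing the widths as template parameters mirrors the compile-time configurability described in the abstract, under the assumptions stated above.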


Pages: 22
