Decimal floating-point fused multiply-add with redundant internal encodings

Cited by: 2

Authors:

Han, Liu [1]

Zhang, Hao [1]

Ko, Seok-Bum [1]

Affiliations:

[1] Univ Saskatchewan, Dept Elect & Comp Engn, 57 Campus Dr, Saskatoon, SK S7N 5A9, Canada

Source:

IET COMPUTERS AND DIGITAL TECHNIQUES | 2016 / Vol. 10 / Issue 04

Funding:

Natural Sciences and Engineering Research Council of Canada (NSERC);

Keywords:

floating point arithmetic; encoding; decimal floating-point fused multiply-add; redundant internal encodings; DFP arithmetic; FMA function; decimal redundant encoding system; rounding operation; critical path reduction; UNIT;

DOI:

10.1049/iet-cdt.2015.0058

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Decimal floating-point (DFP) arithmetic has attracted attention in financial and commercial computing applications. However, the processing efficiency of DFP designs still lags far behind that of binary designs. Meanwhile, the floating-point fused multiply-add (FMA) function is widely used in many processors within functional iterations to implement division, square root, and many other functions, because performing a multiplication followed by an addition with a single rounding gives better accuracy. In this work, a new FMA architecture is proposed to speed up DFP processing. Compared with previous architectures, first, the proposed design applies a specific decimal redundant encoding system, which simplifies the circuits that decide and shift the rounding position on a redundant result. Second, the only digit-set conversion in the entire design is combined with the rounding operation to further reduce the critical path. Third, the techniques applied in different previous FMAs are merged in the proposed design. In addition, the multiplier and adder adopted from previous designs are further optimised. Consequently, compared with the fastest previous design, the synthesis results show about a 33.7% speed advantage and about a 16.6% area advantage.
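The accuracy benefit of a single rounding, as mentioned in the abstract, can be illustrated in software. The following is a minimal sketch using Python's standard `decimal` module; it is not the hardware architecture proposed in the paper. The 4-digit precision, the operand values, and the helper names `mul_then_add` and `fused_mul_add` are assumptions chosen only for illustration.

```python
from decimal import Decimal, Context, ROUND_HALF_EVEN

# Assumed toy precision of 4 significant digits so the rounding
# effect is easy to see (real DFP formats use 7, 16 or 34 digits).
CTX = Context(prec=4, rounding=ROUND_HALF_EVEN)


def mul_then_add(a, b, c, ctx=CTX):
    """Unfused: the product is rounded, then the sum is rounded again."""
    product = ctx.multiply(a, b)   # first rounding
    return ctx.add(product, c)     # second rounding


def fused_mul_add(a, b, c, ctx=CTX):
    """Fused: a*b is kept exact and only the final sum is rounded."""
    return ctx.fma(a, b, c)        # single rounding


if __name__ == "__main__":
    a, b, c = Decimal("1.234"), Decimal("5.678"), Decimal("-7.006")
    # The exact product is 7.006652; rounding it to 4 digits before
    # the addition discards the digits that survive the cancellation.
    print("multiply, then add:", mul_then_add(a, b, c))
    print("fused multiply-add:", fused_mul_add(a, b, c))
```

With these operands the unfused path returns 0.001, while the fused path returns 0.000652, which is the exact value of a*b + c. The gap appears because the addition cancels the leading digits of the product, so the digits lost in the intermediate rounding become the most significant digits of the result.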


Pages: 147-156

Page count: 10

