Closing the gap: CPU and FPGA trends in sustainable floating-point BLAS performance

Cited by: 51

Authors:

Underwood, KD [1]

Hemmert, KS [1]

Affiliations:

[1] Sandia Natl Labs, Albuquerque, NM 87185 USA

Source:

12TH ANNUAL IEEE SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES, PROCEEDINGS | 2004

Keywords:

IEEE floating point; arithmetic; FPGA; reconfigurable computing

DOI:

10.1109/FCCM.2004.21

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

Field programmable gate arrays (FPGAs) have long been an attractive alternative to microprocessors for computing tasks - as long as floating-point arithmetic is not required. Fueled by the advance of Moore's Law, FPGAs are rapidly reaching sufficient densities to enhance peak floating-point performance as well. The question, however, is how much of this peak performance can be sustained. This paper examines three of the basic linear algebra subroutine (BLAS) functions: vector dot product, matrix-vector multiply, and matrix multiply. A comparison of microprocessors, FPGAs, and Reconfigurable Computing platforms is performed for each operation. The analysis highlights the amount of memory bandwidth and internal storage needed to sustain peak performance with FPGAs. This analysis considers the historical context of the last six years and is extrapolated for the next six years.

Pages: 219 - 228

Number of pages: 10

50 items in total

[41] Double Precision Hybrid-Mode Floating-Point FPGA CORDIC Co-processor
Zhou, Jie
Dou, Yong
Lei, Yuanwu
Xu, Jinbo
Dong, Yazhuo
HPCC 2008: 10TH IEEE INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS, PROCEEDINGS, 2008, : 182 - 189
[42] LDPC decoder with a limited-precision FPGA-based floating-point multiplication coprocessor
Moberly, Raymond
O'Sullivan, Michael
Waheed, Khurram
ADVANCED SIGNAL PROCESSING ALGORITHMS, ARCHITECTURES, AND IMPLEMENTATIONS XVII, 2007, 6697
[43] An FPGA-based low-cost VLIW floating-point processor for CNC applications
Dong, Jingchuan
Wang, Taiyong
Li, Bo
Liu, Zhe
Yu, Zhigiang
MICROPROCESSORS AND MICROSYSTEMS, 2017, 50 : 14 - 25
[44] Design and Implementation of Differential Evolution Algorithm on FPGA for Double-Precision Floating-Point Representation
Cortes-Antonio, Prometeo
Rangel-Gonzalez, Josue
Villa-Vargas, Luis A.
Antonio Ramirez-Salinas, Marco
Molina-Lozano, Heron
Batyrshin, Ildar
ACTA POLYTECHNICA HUNGARICA, 2014, 11 (04) : 139 - 153
[45] Design and Implementation for Quadruple Precision Floating-point Multiplier Based on FPGA with Lower Resource Occupancy
Kang Lei
Yan Xiao-ying
2014 Fifth International Conference on Intelligent Systems Design and Engineering Applications (ISDEA), 2014, : 326 - 329
[46] An Area-Efficient Iterative Single-Precision Floating-Point Multiplier Architecture for FPGA
Kim, Sunwoong
Rutenbar, Rob A.
GLSVLSI '19 - PROCEEDINGS OF THE 2019 ON GREAT LAKES SYMPOSIUM ON VLSI, 2019, : 87 - 92
[47] FPGA IMPLEMENTATION OF FLOATING-POINT COMPLEX MATRIX INVERSION BASED ON GAUSS-JORDAN ELIMINATION
Moussa, Sherif
Razik, Ahmed M. Abdel
Dahmane, Adel Omar
Hamam, Habib
2013 26TH ANNUAL IEEE CANADIAN CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING (CCECE), 2013, : 557 - 560
[48] FPGA implementation of floating-point LMS adaptive filters using high-level synthesis
Ushenina, Inna, V
VESTNIK TOMSKOGO GOSUDARSTVENNOGO UNIVERSITETA-UPRAVLENIE VYCHISLITELNAJA TEHNIKA I INFORMATIKA-TOMSK STATE UNIVERSITY JOURNAL OF CONTROL AND COMPUTER SCIENCE, 2022, (59) : 108 - 116
[49] Implementation of Vector Floating-point processing unit on FPGAs for high performance computing
Chen, Shi
Venkatesan, Ramachandran
Gillard, Paul
2008 CANADIAN CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING, VOLS 1-4, 2008, : 840 - 844
[50] FPGA implementation of an exact dot product and its application in variable-precision floating-point arithmetic
Yuanwu Lei
Yong Dou
Yazhuo Dong
Jie Zhou
Fei Xia
The Journal of Supercomputing, 2013, 64 : 580 - 605