Efficient Compilation and Mapping of Fixed Function Combinational Logic onto Digital Signal Processors Targeting Neural Network Inference and Utilizing High-level Synthesis

Times Cited: 0
Authors
Shahsavani, Soheil Nazar [1 ]
Fayyazi, Arash [1 ]
Nazemi, Mahdi [1 ]
Pedram, Massoud [1 ]
Affiliations
[1] Univ Southern Calif, Ming Hsieh Dept Elect & Comp Engn, 3740 McClintock Ave, Los Angeles, CA 90089 USA
Funding
National Science Foundation (US)
Keywords
Digital signal processors; high-level synthesis; Boolean function; FPGA devices
DOI
10.1145/3559543
Chinese Library Classification
TP3 [Computing Technology, Computer Technology]
Discipline Code
0812
Abstract
Recent efforts to improve the performance of neural network (NN) accelerators that meet today's application requirements have given rise to a new trend of logic-based NN inference relying on fixed-function combinational logic. Mapping such large Boolean functions, with many input variables and product terms, to digital signal processors (DSPs) on field-programmable gate arrays (FPGAs) requires a novel framework that accounts for the structure and reconfigurability of DSP blocks during the mapping process. The methodology proposed in this article maps fixed-function combinational logic blocks to a set of Boolean functions whose Boolean operations are assigned to DSP devices rather than to look-up tables on the FPGA, thereby exploiting the high performance, low latency, and parallelism of DSP blocks. The article also presents a design and optimization methodology for compiling and mapping NNs realized with fixed-function combinational logic to DSPs on FPGAs using a high-level synthesis flow. Experimental evaluations across several datasets and selected NNs show that our framework achieves inference latency and output accuracy comparable to those of prior-art FPGA-based NN accelerators employing DSPs.
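To make the mapping idea concrete, the following is a minimal, hypothetical C++ sketch (not the authors' flow or tool output): it evaluates a sum-of-products Boolean function using only wide bitwise operations of the kind a DSP block's 48-bit logic unit can absorb when the design is expressed for high-level synthesis. All names, the 48-bit width, and the toy function are illustrative assumptions.

#include <cstdint>
#include <cstdio>

constexpr int NUM_TERMS = 2;                   // illustrative number of product terms
constexpr uint64_t MASK48 = (1ULL << 48) - 1;  // width of a typical DSP logic datapath (assumed)

// care  : which of the packed input literals participate in the term
// value : required polarity of each participating literal
struct ProductTerm { uint64_t care, value; };

// f(x) = OR over terms of (AND over cared-for literals), computed with wide
// bitwise XOR/AND/compare operations per term.
bool eval_sop(uint64_t x, const ProductTerm (&terms)[NUM_TERMS]) {
    bool out = false;
    for (int t = 0; t < NUM_TERMS; ++t) {
        uint64_t mismatch = ((x ^ terms[t].value) & terms[t].care) & MASK48;
        out |= (mismatch == 0);  // term fires when every cared-for bit matches its polarity
    }
    return out;
}

int main() {
    // toy function over inputs x0..x3 packed into bits 0..3: f = x0*x1 + x2'*x3
    ProductTerm terms[NUM_TERMS] = {
        {0x3, 0x3},  // x0 AND x1
        {0xC, 0x8},  // (NOT x2) AND x3
    };
    uint64_t x = 0xA;  // x3 = 1, x2 = 0, x1 = 1, x0 = 0
    printf("f = %d\n", eval_sop(x, terms));  // prints f = 1 (second term fires)
    return 0;
}

In an actual HLS flow, each wide per-term operation is a candidate for a DSP block's logic unit rather than LUT fabric; the sketch only illustrates how product terms can be packed into wide bitwise operations.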
Pages: 25