LUTNet: Learning FPGA Configurations for Highly Efficient Neural Network Inference

Cited by: 35
Authors
Wang, Erwei [1 ]
Davis, James J. [1 ]
Cheung, Peter Y. K. [1 ]
Constantinides, George A. [1 ]
Affiliations
[1] Imperial Coll London, Dept Elect & Elect Engn, London SW7 2AZ, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
Keywords
Deep neural network; hardware architecture; field-programmable gate array; lookup table;
DOI
10.1109/TC.2020.2978817
Chinese Library Classification
TP3 (computing technology; computer technology)
Discipline code
0812
Abstract
Research has shown that deep neural networks contain significant redundancy, and thus that high classification accuracy can be achieved even when weights and activations are quantized down to binary values. Network binarization on FPGAs greatly increases area efficiency by replacing resource-hungry multipliers with lightweight XNOR gates. However, an FPGA's fundamental building block, the K-LUT, is capable of implementing far more than an XNOR: it can perform any K-input Boolean operation. Inspired by this observation, we propose LUTNet, an end-to-end hardware-software framework for the construction of area-efficient FPGA-based neural network accelerators using the native LUTs as inference operators. We describe the realization of both unrolled and tiled LUTNet architectures, with the latter facilitating smaller, less power-hungry deployment over the former while sacrificing area and energy efficiency along with throughput. For both varieties, we demonstrate that the exploitation of LUT flexibility allows for far heavier pruning than possible in prior works, resulting in significant area savings while achieving comparable accuracy. Against the state-of-the-art binarized neural network implementation, we achieve up to twice the area efficiency for several standard network models when inferencing popular datasets. We also demonstrate that even greater energy efficiency improvements are obtainable.
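The abstract's central observation is that a K-input LUT can realize any of the 2^(2^K) Boolean functions of its inputs, of which the XNOR used in conventional binarized networks is only one special case. The following minimal Python sketch (not the authors' implementation; function names are illustrative) models a K-LUT as a truth table and shows XNOR alongside an arbitrary learned function:

```python
# Hedged sketch: model a K-input FPGA LUT as a truth table of 2**K bits.
# In a standard binarized network, each weight-activation product is an
# XNOR gate; LUTNet's observation is that the same LUT hardware can
# compute ANY Boolean function of its K binary inputs.

def make_lut(truth_table):
    """Return a callable computing a K-input Boolean op by table lookup."""
    def lut(bits):
        # Interpret the input bits (MSB first) as an index into the table.
        idx = 0
        for b in bits:
            idx = (idx << 1) | b
        return truth_table[idx]
    return lut

# XNOR is just one of the 2**(2**2) = 16 possible 2-input functions:
xnor = make_lut([1, 0, 0, 1])        # inputs 00, 01, 10, 11 -> 1, 0, 0, 1

# A 4-LUT can instead absorb several inputs of a pruned network into one
# operator, e.g. a (hypothetical) 4-input majority function:
majority4 = make_lut([1 if bin(i).count("1") >= 2 else 0
                      for i in range(16)])
```

Because a single physical LUT replaces what would otherwise be several XNOR-plus-accumulate operations, pruning away connections costs less accuracy, which is the source of the area savings the abstract reports.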
Pages: 1795-1808 (14 pages)