Accuracy, Training Time and Hardware Efficiency Trade-Offs for Quantized Neural Networks on FPGAs

Cited by: 9
Authors
Bacchus, Pascal [1 ]
Stewart, Robert [1 ]
Komendantskaya, Ekaterina [1 ]
Affiliation
[1] Heriot-Watt Univ, Math & Comp Sci, Edinburgh, Midlothian, Scotland
Source
APPLIED RECONFIGURABLE COMPUTING. ARCHITECTURES, TOOLS, AND APPLICATIONS, ARC 2020 | 2020, Vol. 12083
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
Deep learning; Neural networks; Quantization; FPGA;
DOI
10.1007/978-3-030-44534-8_10
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Neural networks have proven a successful AI approach in many application areas. Some neural network deployments, e.g. in autonomous vehicles and smart drones, require low inference latency and low power consumption to be useful. Whilst FPGAs meet these requirements, the hardware resources a neural network needs to execute often exceed what an FPGA provides. Emerging industry-led frameworks aim to solve this problem by compressing the topology and precision of neural networks, eliminating computations that require memory for execution. Compressing a neural network inevitably comes at the cost of reduced inference accuracy. This paper uses Xilinx's FINN framework to systematically evaluate the trade-off between precision, inference accuracy, training time and hardware resources for 64 quantized neural networks that perform MNIST character recognition. We identify sweet spots around 3-bit precision in the quantization design space after training for 40 epochs, minimising both hardware resources and accuracy loss. With enough training, 2-bit weights achieve almost the same inference accuracy as 3-8 bit weights.
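The quantization the paper evaluates replaces 32-bit floating-point weights with values representable in only a few bits. As a rough illustration of the idea, and not the FINN flow itself (FINN trains quantized networks directly rather than rounding a trained model), the sketch below uniformly quantizes a weight tensor to a signed k-bit grid; the function name and the post-hoc rounding scheme are assumptions of this example.

import numpy as np

def quantize_weights(w: np.ndarray, bits: int) -> np.ndarray:
    # Largest positive level on a signed grid: bits=3 gives levels in [-4, 3].
    qmax = 2 ** (bits - 1) - 1
    max_abs = np.max(np.abs(w))
    # Scale so the largest-magnitude weight maps onto the grid.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest level, clip to the signed range, then de-quantize
    # so the result is directly comparable to the full-precision weights.
    q = np.clip(np.round(w / scale), -(qmax + 1), qmax)
    return q * scale

# Compare quantization error at the precisions the paper studies.
rng = np.random.default_rng(0)
w = rng.normal(size=(128, 128)).astype(np.float32)
for bits in (2, 3, 8):
    mse = float(np.mean((w - quantize_weights(w, bits)) ** 2))
    print(f"{bits}-bit weights, MSE vs. float32: {mse:.5f}")

Post-hoc rounding like this loses noticeably more accuracy at 2 bits than at 3-8 bits; the paper's observation that 2-bit weights can nearly match higher precisions relies on training the quantized network long enough to compensate for the coarser grid.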
Pages: 121-135
Number of pages: 15