Accelerating machine learning at the edge with approximate computing on FPGAs

Times cited: 0
Authors
Leon-Vega, Luis Gerardo [1 ]
Salazar-Villalobos, Eduardo [2 ]
Castro-Godinez, Jorge [3 ]
Affiliations
[1] Inst Tecnol Costa Rica, Cartago, Costa Rica
[2] Univ Trieste, Trieste, Italy
[3] Inst Tecnol Costa Rica, Sch Elect Engn, Cartago, Costa Rica
Source
TECNOLOGIA EN MARCHA | 2022, Vol. 35
Keywords
Approximate computing; edge computing; machine learning; neural networks; linear algebra;
DOI
10.18845/tm.v35i9.6491
CLC classification
O [Mathematical sciences and chemistry]; P [Astronomy and Earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline codes
07; 0710; 09;
Abstract
Performing inference of complex machine learning (ML) algorithms at the edge is becoming important to unlink system functionality from the cloud. However, ML models grow in complexity faster than the available hardware resources. This research aims to accelerate machine learning by offloading the computation to low-end FPGAs and using approximate computing techniques to optimise resource usage, taking advantage of the inherently inexact nature of machine learning models. In this paper, we propose a generic matrix multiply-add processing element design, parameterised in datatype, matrix size, and data width. We evaluate the resource consumption and error behaviour while varying the matrix size and the data width given a fixed-point data type. We determine that the error scales with the matrix size, but it can be compensated by increasing the data width, posing a trade-off between data width and matrix size with respect to the error.
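The trade-off described in the abstract can be reproduced in simulation. The sketch below is not the authors' hardware design; it is a minimal NumPy model, under the assumption that fixed-point quantisation is the dominant error source, showing that the error of a quantised matrix multiply grows with the matrix size and shrinks as fractional bits (a proxy for data width) are added:

```python
import numpy as np

def to_fixed(x, frac_bits):
    """Quantise values to a fixed-point grid with `frac_bits` fractional bits."""
    scale = 2.0 ** frac_bits
    return np.round(x * scale) / scale

def fixed_point_matmul_error(n, frac_bits, trials=50, seed=0):
    """Mean absolute error of an n x n fixed-point matrix multiply
    against the floating-point reference, averaged over random trials."""
    rng = np.random.default_rng(seed)
    errs = []
    for _ in range(trials):
        a = rng.uniform(-1.0, 1.0, (n, n))
        b = rng.uniform(-1.0, 1.0, (n, n))
        ref = a @ b                                   # float reference
        approx = to_fixed(a, frac_bits) @ to_fixed(b, frac_bits)
        errs.append(np.mean(np.abs(approx - ref)))
    return float(np.mean(errs))

# Error accumulates with matrix size (more multiply-add terms per output) ...
e_small = fixed_point_matmul_error(4, frac_bits=8)
e_large = fixed_point_matmul_error(32, frac_bits=8)
# ... but widening the fixed-point representation compensates for it.
e_wide = fixed_point_matmul_error(32, frac_bits=12)
```

Here `e_large > e_small` while `e_wide < e_large`, mirroring the paper's observation that a larger matrix size can be traded against a wider data width for a target error.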
Pages: 54
Cited references
9 records
  • [1] Gonzalez T., 2021, 5 JORN COST INV COMP
  • [2] Intel, 2021, Intel Architecture Instruction Set Extensions and Future Features
  • [3] Lavin Andrew, Gray Scott, Fast Algorithms for Convolutional Neural Networks, 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016: 4013-4021
  • [4] Liang Tailin, Glossner John, Wang Lei, Shi Shaobo, Zhang Xiaotong, Pruning and quantization for deep neural network acceleration: A survey, NEUROCOMPUTING, 2021, 461: 370-403
  • [5] NVIDIA Corporation, 2017, NVIDIA Tesla V100 GPU Architecture
  • [6] Salazar-Villalobos Eduardo, 2022, Zenodo, DOI 10.5281/ZENODO.6272004
  • [7] Schafer Benjamin Carrion, Wang Zi, High-Level Synthesis Design Space Exploration: Past, Present, and Future, IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2020, 39 (10): 2628-2639
  • [8] Wang Zi, Schafer Benjamin Carrion, Learning from the Past: Efficient High-level Synthesis Design Space Exploration for FPGAs, ACM TRANSACTIONS ON DESIGN AUTOMATION OF ELECTRONIC SYSTEMS, 2022, 27 (04)
  • [9] Wu Carole-Jean, Brooks David, Chen Kevin, Chen Douglas, Choudhury Sy, Dukhan Marat, Hazelwood Kim, Isaac Eldad, Jia Yangqing, Jia Bill, Leyvand Tommer, Lu Hao, Lu Yang, Qiao Lin, Reagen Brandon, Spisak Joe, Sun Fei, Tulloch Andrew, Vajda Peter, Wang Xiaodong, Wang Yanghan, Wasti Bram, Wu Yiming, Xian Ran, Yoo Sungjoo, Zhang Peizhao, Machine Learning at Facebook: Understanding Inference at the Edge, 2019 25TH IEEE INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE COMPUTER ARCHITECTURE (HPCA), 2019: 331-344