Recurrent Neural Networks: An Embedded Computing Perspective

Cited by: 64
Authors
Rezk, Nesma M. [1 ]
Purnaprajna, Madhura [2 ]
Nordstrom, Tomas [3 ]
Ul-Abdin, Zain [1 ]
Affiliations
[1] Halmstad Univ, Sch Informat Technol, S-30118 Halmstad, Sweden
[2] Amrita Vishwa Vidyapeetham, Bengaluru 560035, India
[3] Umea Univ, Dept Appl Phys & Elect TFE, S-90187 Umea, Sweden
Keywords
Compression; flexibility; efficiency; embedded computing; long short-term memory (LSTM); quantization; recurrent neural networks (RNNs); ARCHITECTURES; TIME
DOI
10.1109/ACCESS.2020.2982416
CLC number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
Recurrent Neural Networks (RNNs) are a class of machine learning algorithms used for applications with time-series and sequential data. Recently, there has been strong interest in executing RNNs on embedded devices. However, difficulties arise because RNNs require high computational capability and a large memory space. In this paper, we review existing implementations of RNN models on embedded platforms and discuss the methods adopted to overcome the limitations of embedded systems. We define the objectives of mapping RNN algorithms onto embedded platforms and the challenges facing their realization. Then, we explain the components of RNN models from an implementation perspective. We also discuss the optimizations applied to RNNs to run efficiently on embedded platforms. Finally, we compare the defined objectives with the implementations and highlight some open research questions and aspects currently not addressed for embedded RNNs. Overall, applying algorithmic optimizations to RNN models and decreasing the memory access overhead is vital to obtaining high efficiency. To further increase implementation efficiency, we point out the more promising optimizations that could be applied in future research. Additionally, this article observes that many implementations have targeted high performance, while flexibility has, as yet, been attempted less often. Thus, the article provides guidelines for RNN hardware designers to better support flexibility.
Pages: 57967-57996
Page count: 30