Memristive-based Mixed-signal CGRA for Accelerating Deep Neural Network Inference

Cited by: 2
Authors
Kazerooni-Zand, Reza [1]
Kamal, Mehdi [2]
Afzali-Kusha, Ali [1, 3]
Pedram, Massoud [2]
Affiliations
[1] Univ Tehran, Sch Elect & Comp Engn, Coll Engn, North Karegar St, Tehran 1439957131, Iran
[2] Univ Southern Calif, Elect & Comp Engn Dept, 3740 McClintock Ave, Los Angeles, CA USA
[3] Inst Res Fundamental Sci IPM, Sch Comp Sci, Lavasani St, Tehran 1953833511, Iran
Keywords
Coarse-grained reconfigurable architecture; accelerator; memristor; convolutional neural network; reliability improvement; energy; optimization
DOI
10.1145/3595638
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
This paper presents a mixed-signal coarse-grained reconfigurable architecture (CGRA) for accelerating inference in deep neural networks (DNNs). Dot-product computations are performed in the analog domain to achieve a considerable speed improvement, while all other computations are performed digitally. The proposed structure, called MX-CGRA, employs analog tiles consisting of memristor crossbars. To reduce the overhead of converting data between the analog and digital domains, a proper interface is used between the analog and digital tiles. In addition, the structure benefits from an efficient memory hierarchy in which data is moved as close as possible to the computing fabric. To fully utilize the tiles, a set of micro-instructions is defined to configure the analog and digital domains. These instructions, which are generated by a companion compiler tool, determine the corresponding context words used in the CGRA. The efficacy of the MX-CGRA is assessed by modeling the execution of state-of-the-art DNN architectures that classify images from the ImageNet dataset. Simulation results show that, compared with previous mixed-signal DNN accelerators, the MX-CGRA achieves 2.35x higher throughput on average.
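The abstract's core idea, computing dot products in the analog domain on a memristor crossbar, can be illustrated with a minimal idealized sketch. It is not the MX-CGRA design: the conductance range, the differential-pair weight mapping, and every function name below are assumptions made for illustration, and converter quantization, wire resistance, and device variation are ignored.

```python
# Illustrative sketch only: an idealized memristor crossbar computing a
# vector-matrix product. All names and device values are assumptions and
# are not taken from the MX-CGRA paper.

G_MIN = 1e-6  # assumed minimum programmable conductance (siemens)
G_MAX = 1e-4  # assumed maximum programmable conductance (siemens)


def weights_to_conductances(weights, w_max):
    """Map each signed weight in [-w_max, w_max] to a pair (g_pos, g_neg)."""
    scale = (G_MAX - G_MIN) / w_max
    g_pos = [[G_MIN + scale * max(w, 0.0) for w in row] for row in weights]
    g_neg = [[G_MIN + scale * max(-w, 0.0) for w in row] for row in weights]
    return g_pos, g_neg, scale


def crossbar_currents(voltages, conductances):
    """Column currents from Ohm's and Kirchhoff's laws: I_j = sum_i V_i * G_ij."""
    n_cols = len(conductances[0])
    return [
        sum(v * row[j] for v, row in zip(voltages, conductances))
        for j in range(n_cols)
    ]


def analog_dot_product(voltages, weights, w_max=1.0):
    """Recover x @ W from the difference of the two column-current vectors."""
    g_pos, g_neg, scale = weights_to_conductances(weights, w_max)
    i_pos = crossbar_currents(voltages, g_pos)
    i_neg = crossbar_currents(voltages, g_neg)
    # The G_MIN offset cancels in the subtraction, leaving scale * (x @ W).
    return [(p - n) / scale for p, n in zip(i_pos, i_neg)]


if __name__ == "__main__":
    x = [0.2, 0.4]                       # input activations as row voltages
    w = [[0.5, -1.0], [0.25, 0.75]]      # 2x2 weight matrix (rows = inputs)
    print(analog_dot_product(x, w))
```

In this model every column current is produced at once, which is the source of the speed advantage that analog dot-product engines aim for; a real design would still need converters between the analog and digital domains, which is the overhead the abstract says the interface is meant to reduce.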
Pages: 23