Memristive-based Mixed-signal CGRA for Accelerating Deep Neural Network Inference

Cited by: 2

Authors:

Kazerooni-Zand, Reza [1]

Kamal, Mehdi [2]

Afzali-Kusha, Ali [1,3]

Pedram, Massoud [2]

Affiliations:

[1] Univ Tehran, Sch Elect & Comp Engn, Coll Engn, North Karegar St, Tehran 1439957131, Iran

[2] Univ Southern Calif, Elect & Comp Engn Dept, 3740 McClintock Ave, Los Angeles, CA USA

[3] Inst Res Fundamental Sci IPM, Sch Comp Sci, Lavasani St, Tehran 1953833511, Iran

Source:

ACM TRANSACTIONS ON DESIGN AUTOMATION OF ELECTRONIC SYSTEMS | 2023, Vol. 28, No. 4

Keywords:

Coarse-grained reconfigurable architecture; accelerator; memristor; Convolutional Neural Network; RELIABILITY IMPROVEMENT; ENERGY; OPTIMIZATION;

DOI:

10.1145/3595638

Chinese Library Classification:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

In this paper, a mixed-signal coarse-grained reconfigurable architecture (CGRA) for accelerating inference in deep neural networks (DNNs) is presented. It is based on performing dot-product computations using analog computing to achieve a considerable speed improvement. Other computations are performed digitally. In the proposed structure (called MX-CGRA), analog tiles consisting of memristor crossbars are employed. To reduce the overhead of converting the data between analog and digital domains, we utilize a proper interface between the analog and digital tiles. In addition, the structure benefits from an efficient memory hierarchy where the data is moved as close as possible to the computing fabric. Moreover, to fully utilize the tiles, we define a set of micro instructions to configure the analog and digital domains. Corresponding context words used in the CGRA are determined by these instructions (generated by a companion compiler tool). The efficacy of the MX-CGRA is assessed by modeling the execution of state-of-the-art DNN architectures on this structure. The architectures are used to classify images of the ImageNet dataset. Simulation results show that, compared to previous mixed-signal DNN accelerators, MX-CGRA achieves 2.35× higher throughput on average.
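The analog dot product mentioned in the abstract can be illustrated with an idealized crossbar model: input values are applied as row voltages, weights are stored as cell conductances, and each column current is the sum of voltage times conductance (Ohm's law plus Kirchhoff's current law). The sketch below is a generic illustration, not the MX-CGRA design. The differential conductance pair for signed weights, the conductance range `g_min`/`g_max`, and all function names are assumptions introduced here, and converter quantization and device non-idealities are ignored.

```python
# Idealized memristor-crossbar dot product (illustrative assumptions only;
# none of these names or parameter values come from the paper).

def weights_to_conductances(weights, g_min=1e-6, g_max=1e-4):
    """Map signed weights to two conductance matrices (in siemens).

    Positive parts go to g_pos, negative parts to g_neg, so that
    g_pos - g_neg is proportional to the original weight.
    """
    w_max = max(abs(w) for row in weights for w in row) or 1.0
    scale = (g_max - g_min) / w_max
    g_pos = [[g_min + scale * max(w, 0.0) for w in row] for row in weights]
    g_neg = [[g_min + scale * max(-w, 0.0) for w in row] for row in weights]
    return g_pos, g_neg, scale


def crossbar_currents(voltages, conductances):
    """Column currents of an ideal crossbar: I_j = sum_i V_i * G_ij."""
    n_cols = len(conductances[0])
    return [
        sum(v * row[j] for v, row in zip(voltages, conductances))
        for j in range(n_cols)
    ]


def analog_dot_product(inputs, weights):
    """Compute inputs @ weights through the idealized crossbar model.

    weights[i][j] connects input row i to output column j.
    """
    g_pos, g_neg, scale = weights_to_conductances(weights)
    i_pos = crossbar_currents(inputs, g_pos)
    i_neg = crossbar_currents(inputs, g_neg)
    # Subtracting the two column currents cancels the g_min offset;
    # dividing by scale converts the current back to weight units.
    return [(p - n) / scale for p, n in zip(i_pos, i_neg)]


if __name__ == "__main__":
    x = [0.2, -0.5, 1.0]
    w = [[1.0, -2.0], [0.5, 0.0], [-1.5, 3.0]]
    print(analog_dot_product(x, w))
```

In this model every column produces its result in a single step regardless of the number of rows, which is the usual source of the speed advantage attributed to analog dot-product units. A real design also pays for digital-to-analog and analog-to-digital conversion at the tile boundary, which this sketch leaves out.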


Pages: 23
