Fully Parallel Stochastic Computing Hardware Implementation of Convolutional Neural Networks for Edge Computing Applications

Cited by: 26

Authors:

Frasser, Christiam F. [1]

Linares-Serrano, Pablo [2]

de los Rios, Ivan Diez [2]

Moran, Alejandro [1]

Skibinsky-Gitlin, Erik S. [1]

Font-Rossello, Joan [1,3]

Canals, Vincent [1,3]

Roca, Miquel [1,3]

Serrano-Gotarredona, Teresa [2]

Rossello, Josep L. [1,3]

Affiliations:

[1] Univ Balearic Isl, Elect Engn Grp, Ind Engn & Construct Dept, Palma De Mallorca 07122, Spain

[2] CSIC, Inst Microelect Sevilla, IMSE, CNM, Seville 41092, Spain

[3] Balearic Isl Hlth Res Inst IdISBa, Palma De Mallorca 07120, Spain

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2023 / Vol. 34 / No. 12

Keywords:

Logic gates; Hardware; Correlation; Computer architecture; Convolutional neural networks; Internet of Things; Europe; Convolutional neural networks (CNNs); edge computing (EC); stochastic computing (SC)

DOI:

10.1109/TNNLS.2022.3166799

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Edge artificial intelligence (AI) is receiving a tremendous amount of interest from the machine learning community due to the ever-increasing popularization of the Internet of Things (IoT). Unfortunately, the incorporation of AI characteristics to edge computing devices presents the drawbacks of being power and area hungry for typical deep learning techniques such as convolutional neural networks (CNNs). In this work, we propose a power-and-area efficient architecture based on the exploitation of the correlation phenomenon in stochastic computing (SC) systems. The proposed architecture solves the challenges that a CNN implementation with SC (SC-CNN) may present, such as the high resources used in binary-to-stochastic conversion, the inaccuracy produced by undesired correlation between signals, and the complexity of the stochastic maximum function implementation. To prove that our architecture meets the requirements of edge intelligence realization, we embed a fully parallel CNN in a single field-programmable gate array (FPGA) chip. The results obtained showed a better performance than traditional binary logic and other SC implementations. In addition, we performed a full VLSI synthesis of the proposed design, showing that it presents better overall characteristics than other recently published VLSI architectures.
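The abstract refers to exploiting correlation between stochastic bitstreams and to a stochastic maximum function. As a rough illustration of that general idea only (not the paper's architecture), the sketch below uses a well-known stochastic computing property: when two unipolar bitstreams are generated from the same random thresholds, a bitwise OR yields the maximum and a bitwise AND the minimum, whereas an AND of independently generated streams approximates the product. The unipolar encoding, the stream length `n`, and the helper names `to_stream` and `decode` are assumptions made for this example.

```python
import random


def to_stream(value, thresholds):
    """Unipolar encoding (assumed here): bit is 1 when the threshold is below value."""
    return [1 if r < value else 0 for r in thresholds]


def decode(stream):
    """Estimate the encoded value as the fraction of ones in the stream."""
    return sum(stream) / len(stream)


rng = random.Random(0)   # fixed seed so the run is repeatable
n = 4096                 # stream length (illustrative choice)
shared = [rng.random() for _ in range(n)]   # thresholds shared by both inputs
other = [rng.random() for _ in range(n)]    # independent thresholds

x, y = 0.3, 0.7

# Correlated streams: both values are compared against the same thresholds.
xs = to_stream(x, shared)
ys = to_stream(y, shared)
sc_max = decode([a | b for a, b in zip(xs, ys)])   # OR  -> max(x, y)
sc_min = decode([a & b for a, b in zip(xs, ys)])   # AND -> min(x, y)

# Uncorrelated streams: y is re-encoded with independent thresholds.
yu = to_stream(y, other)
sc_prod = decode([a & b for a, b in zip(xs, yu)])  # AND -> approximately x * y

print(f"max ~ {sc_max:.3f}, min ~ {sc_min:.3f}, product ~ {sc_prod:.3f}")
```

Sharing one threshold source across inputs is also what lets a single random number generator serve many binary-to-stochastic converters, which relates to the conversion cost the abstract mentions. The actual circuits used in the paper are not described in this record.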


Pages: 10408-10418

Page count: 11

