CAP-RAM: A Charge-Domain In-Memory Computing 6T-SRAM for Accurate and Precision-Programmable CNN Inference

Cited by: 72
Authors
Chen, Zhiyu [1 ]
Yu, Zhanghao [1 ]
Jin, Qing [2 ]
He, Yan [1 ]
Wang, Jingyu [1 ]
Lin, Sheng [2 ]
Li, Dai [1 ]
Wang, Yanzhi [2 ]
Yang, Kaiyuan [1 ]
Affiliations
[1] Rice Univ, Dept Elect & Comp Engn, Houston, TX 77005 USA
[2] Northeastern Univ, Dept Elect & Comp Engn, Boston, MA 02115 USA
Keywords
CMOS; convolutional neural networks (CNNs); deep learning accelerator; in-memory computation; mixed-signal computation; static random-access memory (SRAM)
DOI
10.1109/JSSC.2021.3056447
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline Classification Codes
0808; 0809;
Abstract
A compact, accurate, and bitwidth-programmable in-memory computing (IMC) static random-access memory (SRAM) macro, named CAP-RAM, is presented for energy-efficient convolutional neural network (CNN) inference. It leverages a novel charge-domain multiply-and-accumulate (MAC) mechanism and circuitry to achieve superior linearity under process variations compared with conventional IMC designs. The adopted semi-parallel architecture efficiently stores filters from multiple CNN layers by sharing eight standard 6T SRAM cells with one charge-domain MAC circuit. Moreover, up to six weight bit-width levels with two encoding schemes and eight input-activation levels are supported. A 7-bit charge-injection SAR (ciSAR) analog-to-digital converter (ADC) that eliminates sample-and-hold (S&H) circuits and input/reference buffers further improves the overall energy efficiency and throughput. A 65-nm prototype validates the excellent linearity and computing accuracy of CAP-RAM. A single 512 × 128 macro stores a complete pruned and quantized CNN model and achieves 98.8% inference accuracy on the MNIST data set and 89.0% on the CIFAR-10 data set, with a 573.4-giga-operations-per-second (GOPS) peak throughput and a 49.4-tera-operations-per-second-per-watt (TOPS/W) energy efficiency.
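The charge-domain MAC flow summarized in the abstract (multi-bit activations driving stored multi-bit weights, charge sharing along a column, then a 7-bit ADC readout) can be illustrated with a minimal behavioral sketch. This is an idealized model under stated assumptions, not the authors' circuit: capacitor matching is taken as perfect, the function name and the linear code mapping are illustrative, and the default ranges simply mirror the abstract's parameters (8-level activations, up to ~5-bit-magnitude signed weights, 7-bit ADC).

```python
# Hedged behavioral sketch of one charge-domain MAC column, NOT the
# paper's circuit implementation. Assumptions: ideal capacitor matching,
# linear charge sharing, and an ideal 7-bit ADC spanning the full
# analog swing. All names and scalings here are illustrative.

def charge_domain_mac(acts, weights, act_levels=8, w_max=31, adc_bits=7):
    """Ideal charge-sharing dot product followed by ADC quantization.

    acts    : activation codes in [0, act_levels - 1]
    weights : signed weight codes in [-w_max, w_max]
    Returns the ADC output code in [0, 2**adc_bits - 1].
    """
    assert len(acts) == len(weights)
    n = len(acts)
    # Charge sharing: the column settles to the average per-cell product.
    v = sum(a * w for a, w in zip(acts, weights)) / n
    # Map the full analog swing linearly onto the ADC code range.
    v_max = (act_levels - 1) * w_max
    return round((v + v_max) / (2 * v_max) * (2 ** adc_bits - 1))

# A fully negative dot product maps to code 0, a fully positive one to
# code 127, and intermediate results land in between linearly.
print(charge_domain_mac([7, 7], [31, 31]))    # full-scale positive -> 127
print(charge_domain_mac([7, 7], [-31, -31]))  # full-scale negative -> 0
```

In a real macro the ADC code would then be rescaled digitally; the point of the sketch is only that the column output is a linear function of the dot product, which is the linearity property the prototype measurements validate.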
Pages: 1924-1935 (12 pages)