CAP-RAM: A Charge-Domain In-Memory Computing 6T-SRAM for Accurate and Precision-Programmable CNN Inference

Cited by: 72
Authors
Chen, Zhiyu [1 ]
Yu, Zhanghao [1 ]
Jin, Qing [2 ]
He, Yan [1 ]
Wang, Jingyu [1 ]
Lin, Sheng [2 ]
Li, Dai [1 ]
Wang, Yanzhi [2 ]
Yang, Kaiyuan [1 ]
Affiliations
[1] Rice Univ, Dept Elect & Comp Engn, Houston, TX 77005 USA
[2] Northeastern Univ, Dept Elect & Comp Engn, Boston, MA 02115 USA
Keywords
CMOS; convolutional neural networks (CNNs); deep learning accelerator; in-memory computation; mixed-signal computation; static random-access memory (SRAM)
DOI
10.1109/JSSC.2021.3056447
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline Classification Codes
0808; 0809;
Abstract
A compact, accurate, and bitwidth-programmable in-memory computing (IMC) static random-access memory (SRAM) macro, named CAP-RAM, is presented for energy-efficient convolutional neural network (CNN) inference. It leverages a novel charge-domain multiply-and-accumulate (MAC) mechanism and circuitry to achieve superior linearity under process variations compared with conventional IMC designs. The adopted semi-parallel architecture efficiently stores filters from multiple CNN layers by sharing eight standard 6T SRAM cells with one charge-domain MAC circuit. Moreover, up to six weight bit-width levels with two encoding schemes and eight input-activation levels are supported. A 7-bit charge-injection SAR (ciSAR) analog-to-digital converter (ADC) that eliminates sample-and-hold (S&H) circuits and input/reference buffers further improves the overall energy efficiency and throughput. A 65-nm prototype validates the excellent linearity and computing accuracy of CAP-RAM. A single 512 × 128 macro stores a complete pruned and quantized CNN model and achieves 98.8% inference accuracy on the MNIST data set and 89.0% on the CIFAR-10 data set, with a 573.4-giga-operations-per-second (GOPS) peak throughput and a 49.4-tera-operations-per-second-per-watt (TOPS/W) energy efficiency.
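The charge-domain MAC flow summarized in the abstract (multi-bit activations driving stored multi-bit weights, charge sharing along a column, then a 7-bit ADC readout) can be illustrated with a minimal behavioral sketch. This is an idealized model under stated assumptions, not the authors' circuit: capacitor matching is taken as perfect, the function name and the linear code mapping are illustrative, and the default ranges simply mirror the abstract's parameters (8-level activations, up to ~5-bit-magnitude signed weights, 7-bit ADC).

```python
# Hedged behavioral sketch of one charge-domain MAC column, NOT the
# paper's circuit implementation. Assumptions: ideal capacitor matching,
# linear charge sharing, and an ideal 7-bit ADC spanning the full
# analog swing. All names and scalings here are illustrative.

def charge_domain_mac(acts, weights, act_levels=8, w_max=31, adc_bits=7):
    """Ideal charge-sharing dot product followed by ADC quantization.

    acts    : activation codes in [0, act_levels - 1]
    weights : signed weight codes in [-w_max, w_max]
    Returns the ADC output code in [0, 2**adc_bits - 1].
    """
    assert len(acts) == len(weights)
    n = len(acts)
    # Charge sharing: the column settles to the average per-cell product.
    v = sum(a * w for a, w in zip(acts, weights)) / n
    # Map the full analog swing linearly onto the ADC code range.
    v_max = (act_levels - 1) * w_max
    return round((v + v_max) / (2 * v_max) * (2 ** adc_bits - 1))

# A fully negative dot product maps to code 0, a fully positive one to
# code 127, and intermediate results land in between linearly.
print(charge_domain_mac([7, 7], [31, 31]))    # full-scale positive -> 127
print(charge_domain_mac([7, 7], [-31, -31]))  # full-scale negative -> 0
```

In a real macro the ADC code would then be rescaled digitally; the point of the sketch is only that the column output is a linear function of the dot product, which is the linearity property the prototype measurements validate.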
Pages: 1924-1935 (12 pages)