A Local Computing Cell and 6T SRAM-Based Computing-in-Memory Macro With 8-b MAC Operation for Edge AI Chips

Cited by: 74
Authors
Si, Xin [1]
Tu, Yung-Ning [2]
Huang, Wei-Hsing [2]
Su, Jian-Wei [2]
Lu, Pei-Jung [2]
Wang, Jing-Hong [2]
Liu, Ta-Wei [2]
Wu, Ssu-Yen [2]
Liu, Ruhui [2]
Chou, Yen-Chi [2]
Chung, Yen-Lin [2]
Shih, William [2]
Lo, Chung-Chuan [2]
Liu, Ren-Shuo [2]
Hsieh, Chih-Cheng [2]
Tang, Kea-Tiong [2]
Lien, Nan-Chun [3]
Shih, Wei-Chiang [3]
He, Yajuan [4]
Li, Qiang [4]
Chang, Meng-Fan [2]
Affiliations
[1] Southeast Univ, Natl ASIC Syst Engn Res Ctr, Sch Elect Sci & Engn, Nanjing 210096, Peoples R China
[2] Natl Tsing Hua Univ NTHU, Dept Elect Engn, Hsinchu 30013, Taiwan
[3] M31 Technol, Hsinchu 30013, Taiwan
[4] Univ Elect Sci & Technol China UESTC, Inst Integrated Circuits & Syst, Chengdu 610054, Peoples R China
Keywords
SRAM cells; Common Information Model (computing); Computer architecture; Transistors; Software; Neural networks; Microprocessors; Artificial intelligence (AI); computing-in-memory; local computing cell (LCC); random access memory; weight-bitwise MAC (WbwMAC) operation; UNIT-MACRO; COMPUTATION
DOI
10.1109/JSSC.2021.3073254
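Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification code
0808; 0809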
Abstract
This article presents a computing-in-memory (CIM) structure aimed at improving the energy efficiency of edge devices running multi-bit multiply-and-accumulate (MAC) operations. The proposed scheme includes a 6T SRAM-based CIM (SRAM-CIM) macro featuring: 1) weight-bitwise MAC (WbwMAC) operations to expand the sensing margin and improve the readout accuracy for high-precision MAC operations; 2) a compact 6T local computing cell (LCC) to perform multiplication with suppressed sensitivity to process variation; 3) an algorithm-adaptive low MAC-aware readout scheme to improve energy efficiency; 4) a bitline header selection scheme to enlarge signal margin; and 5) a small-offset margin-enhanced sense amplifier for robust read operations against process variation. A fabricated 28-nm 64-kb SRAM-CIM macro achieved access times of 4.1-8.4 ns with energy efficiency of 11.5-68.4 TOPS/W, while performing MAC operations with 4- or 8-b input and weight precision.
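The abstract does not spell out how the WbwMAC operation is carried out in the circuit. As a rough software illustration only, the sketch below assumes the term refers to the standard arithmetic identity in which a multi-bit MAC is split into one partial MAC per weight bit, with the partial sums then shifted and added. The function name `weight_bitwise_mac`, the unsigned-weight assumption, and the sample values are hypothetical and are not taken from the paper.

```python
def weight_bitwise_mac(inputs, weights, weight_bits=8):
    """Compute sum(x * w) one weight bit-plane at a time.

    Assumption (not from the paper): weights are unsigned integers
    that fit in `weight_bits` bits.
    """
    total = 0
    for b in range(weight_bits):
        # Partial MAC for bit-plane b: each weight contributes only
        # its b-th bit (0 or 1), so this is a sum of selected inputs.
        partial = sum(x * ((w >> b) & 1) for x, w in zip(inputs, weights))
        # Weight the partial sum by the significance of bit b (2**b).
        total += partial << b
    return total


# Illustrative values only: 8-b unsigned inputs and weights.
inputs = [3, 7, 200, 15]
weights = [255, 0, 17, 128]

direct = sum(x * w for x, w in zip(inputs, weights))
assert weight_bitwise_mac(inputs, weights) == direct
```

Each per-bit partial sum spans a much smaller range of values than the full multi-bit result, which fits the abstract's stated motivation of expanding the sensing margin. How the macro actually generates and combines these partial results is not stated in this record.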
Pages: 2817-2831
Page count: 15