A Dual 7T SRAM-Based Zero-Skipping Compute-In-Memory Macro With 1-6b Binary Searching ADCs for Processing Quantized Neural Networks

Cited by: 1
Authors
Yu, Chengshuo [1 ]
Jiang, Haoge [2 ]
Mu, Junjie [2 ]
Chai, Kevin Tshun Chuan [3 ]
Kim, Tony Tae-Hyoung [2 ]
Kim, Bongjin [4 ]
Affiliations
[1] Zhangjiang Lab, Shanghai 201210, Peoples R China
[2] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore 639798, Singapore
[3] ASTAR, Inst Microelect, Singapore 138634, Singapore
[4] Univ Calif Santa Barbara, Dept Elect & Comp Engn, Santa Barbara, CA 93106 USA
Keywords
Voltage; In-memory computing; Random access memory; Neural networks; MOSFET; Energy efficiency; Computational modeling; SRAM; compute-in-memory (CIM); quantized neural networks; multiply and accumulate (MAC); zero-skipping; current-mode accumulation; UNIT-MACRO; PRECISION; WEIGHT; CHIP
DOI
10.1109/TCSI.2024.3411608
CLC Classification Codes
TM [Electrical Technology]; TN [Electronics and Communication Technology]
Subject Classification Codes
0808; 0809
Abstract
This article presents a novel dual 7T static random-access memory (SRAM)-based compute-in-memory (CIM) macro for processing quantized neural networks. The proposed SRAM-based CIM macro decouples read and write operations and employs a zero-input/weight-skipping scheme. A 65-nm test chip with 528 x 128 integrated dual 7T bitcells demonstrated reconfigurable-precision multiply-and-accumulate (MAC) operations with 384 binary inputs (0/1) and 384 x 128 programmable multi-bit weights (3/7/15 levels). Each column comprises 384 bitcells for a dot product, 48 bitcells for offset calibration, and 96 bitcells for binary-searching analog-to-digital conversion. The analog-to-digital converter (ADC) converts the voltage difference between two read bitlines (i.e., an analog dot-product result) into a 1-6b digital output code by binary searching over 1-6 conversion cycles using replica bitcells. The test chip with 66-Kb embedded dual SRAM bitcells was evaluated on neural-network workloads, including MNIST image classification using a multi-layer perceptron (MLP) model with a 784-256-256-256-10 layer configuration. The measured classification accuracies are 97.62%, 97.65%, and 97.72% for the 3-, 7-, and 15-level weights, respectively, only 0.58%-0.74% below the software-simulation baseline. For a VGG6 model on the CIFAR-10 image dataset, the accuracies are 88.59%, 88.21%, and 89.07% for the 3-, 7-, and 15-level weights, only 0.6%-1.32% below the software baseline. The measured energy efficiencies are 258.5, 67.9, and 23.9 TOPS/W for the 3-, 7-, and 15-level weights, respectively, at 0.45-V/0.8-V supplies.
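The binary-searching conversion described in the abstract resolves one output bit per conversion cycle, MSB first, much like a successive-approximation search. Below is a minimal behavioral sketch of that idea (not the circuit: the replica-bitcell threshold generation is idealized as a midpoint comparison, and all names and the full-scale parameter are hypothetical):

```python
def binary_search_adc(v_diff, n_bits=6, v_full_scale=1.0):
    """Resolve an analog voltage difference (0..v_full_scale) into an
    n_bits code in n_bits conversion cycles, one bit per cycle (MSB first)."""
    code = 0
    low, high = 0.0, v_full_scale
    for _ in range(n_bits):
        mid = (low + high) / 2          # threshold (replica bitcells in the macro)
        bit = 1 if v_diff >= mid else 0
        code = (code << 1) | bit
        if bit:
            low = mid                    # keep searching the upper half
        else:
            high = mid                   # keep searching the lower half
    return code

# A mid-scale voltage difference resolves to the mid-range code.
print(binary_search_adc(0.5, n_bits=6))   # -> 32
```

Reconfigurable 1-6b precision then simply corresponds to stopping after `n_bits` cycles, trading conversion energy and latency for output resolution.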
Pages: 3672-3682
Page count: 11