A Dual 7T SRAM-Based Zero-Skipping Compute-In-Memory Macro With 1-6b Binary Searching ADCs for Processing Quantized Neural Networks

Cited by: 1
Authors
Yu, Chengshuo [1 ]
Jiang, Haoge [2 ]
Mu, Junjie [2 ]
Chai, Kevin Tshun Chuan [3 ]
Kim, Tony Tae-Hyoung [2 ]
Kim, Bongjin [4 ]
Affiliations
[1] Zhangjiang Lab, Shanghai 201210, Peoples R China
[2] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore 639798, Singapore
[3] ASTAR, Inst Microelect, Singapore 138634, Singapore
[4] Univ Calif Santa Barbara, Dept Elect & Comp Engn, Santa Barbara, CA 93106 USA
Keywords
Voltage; In-memory computing; Random access memory; Neural networks; MOSFET; Energy efficiency; Computational modeling; SRAM; compute-in-memory (CIM); quantized neural networks; multiply and accumulate (MAC); zero-skipping; current-mode accumulation; UNIT-MACRO; PRECISION; WEIGHT; CHIP
DOI
10.1109/TCSI.2024.3411608
CLC Classification Codes
TM [Electrical Technology]; TN [Electronics and Communication Technology]
Subject Classification Codes
0808; 0809
Abstract
This article presents a novel dual 7T static random-access memory (SRAM)-based compute-in-memory (CIM) macro for processing quantized neural networks. The proposed SRAM-based CIM macro decouples read and write operations and employs a zero-input/weight-skipping scheme. A 65-nm test chip with 528 x 128 integrated dual 7T bitcells demonstrated reconfigurable-precision multiply-and-accumulate (MAC) operations with 384 binary inputs (0/1) and 384 x 128 programmable multi-bit weights (3/7/15 levels). Each column comprises 384 bitcells for a dot product, 48 bitcells for offset calibration, and 96 bitcells for binary-searching analog-to-digital conversion. The analog-to-digital converter (ADC) converts the voltage difference between two read bitlines (i.e., an analog dot-product result) into a 1-6b digital output code by binary searching over 1-6 conversion cycles using replica bitcells. The test chip with 66-Kb embedded dual SRAM bitcells was evaluated on neural-network workloads, including MNIST image classification using a multi-layer perceptron (MLP) model with a 784-256-256-256-10 layer configuration. The measured classification accuracies are 97.62%, 97.65%, and 97.72% for the 3-, 7-, and 15-level weights, respectively, only 0.58%-0.74% below the software-simulation baseline. For a VGG6 model on the CIFAR-10 image dataset, the accuracies are 88.59%, 88.21%, and 89.07% for the 3-, 7-, and 15-level weights, only 0.6%-1.32% below the software baseline. The measured energy efficiencies are 258.5, 67.9, and 23.9 TOPS/W for the 3-, 7-, and 15-level weights, respectively, at 0.45-V/0.8-V supplies.
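The binary-searching conversion described in the abstract resolves one output bit per conversion cycle, MSB first, much like a successive-approximation search. Below is a minimal behavioral sketch of that idea (not the circuit: the replica-bitcell threshold generation is idealized as a midpoint comparison, and all names and the full-scale parameter are hypothetical):

```python
def binary_search_adc(v_diff, n_bits=6, v_full_scale=1.0):
    """Resolve an analog voltage difference (0..v_full_scale) into an
    n_bits code in n_bits conversion cycles, one bit per cycle (MSB first)."""
    code = 0
    low, high = 0.0, v_full_scale
    for _ in range(n_bits):
        mid = (low + high) / 2          # threshold (replica bitcells in the macro)
        bit = 1 if v_diff >= mid else 0
        code = (code << 1) | bit
        if bit:
            low = mid                    # keep searching the upper half
        else:
            high = mid                   # keep searching the lower half
    return code

# A mid-scale voltage difference resolves to the mid-range code.
print(binary_search_adc(0.5, n_bits=6))   # -> 32
```

Reconfigurable 1-6b precision then simply corresponds to stopping after `n_bits` cycles, trading conversion energy and latency for output resolution.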
Pages: 3672-3682
Page count: 11