A Reconfigurable 16Kb AND8T SRAM Macro With Improved Linearity for Multibit Compute-In Memory of Artificial Intelligence Edge Devices

Cited by: 21
Authors
Sharma, Vishal [1]
Kim, Ju-Eon [1]
Kim, Hyunjoon [1]
Lu, Lu [1, 2]
Kim, Tony Tae-Hyoung [1]
Affiliations
[1] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore 639798, Singapore
[2] ASTAR, Inst Microelect, Singapore 138634, Singapore
Keywords
Random access memory; Voltage; Linearity; Artificial intelligence; SRAM cells; Performance evaluation; Neurons; SRAM; energy-efficiency; bit-precision; multiply-and-accumulate (MAC); compute-in-memory (CIM); ENERGY-EFFICIENT; UNIT-MACRO; ACCELERATOR; INFERENCE; CELL;
DOI
10.1109/JETCAS.2022.3168571
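Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract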
Compute-in-memory (CIM) is a promising candidate for performing energy-efficient multiply-and-accumulate (MAC) operations in modern artificial intelligence (AI) edge devices. This work proposes a multi-bit precision (4b input, 4b weight, and 4b output) 128 × 128 SRAM CIM architecture. The 4b input is implemented using a voltage-scaling and charge-sharing-based scheme. To achieve efficient computation with improved linearity, a novel AND-logic-based 8T SRAM cell (AND8T) is proposed. To address the non-idealities of analog voltage- or current-based operations, the proposed AND8T employs charge-domain computation by overlaying a metal-oxide-metal capacitor (MOM cap) with no area overhead. The proposed AND8T mitigates the linearity issue of MAC operations, which is highly desirable for the reliable operation of complex neural networks (CNNs). The proposed 16Kb macro asserts 128 inputs in parallel and processes a 128-element 4b dot product in a single cycle for each array column (a single neuron). The macro can also be reconfigured for 64 or 32 parallel 4b inputs, depending on the needs of the CNN model. The AND8T SRAM macro is fabricated in a 65 nm node and achieves an energy efficiency of 301.08 TOPS/W for 16 parallel neuron outputs, with 128 4b MAC operations at a 10 MHz clock frequency and a 1 V supply. The implemented macro supports clock frequencies up to 100 MHz and occupies 0.124 mm² of chip area while achieving 96.05% and 87% classification accuracy on the MNIST and CIFAR-10 datasets, respectively.
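As a rough illustration of the per-column operation described in the abstract, the following Python sketch models one neuron's MAC as an idealized digital computation. It is not the paper's charge-domain circuit. The function name `ideal_column_mac`, the unsigned 0 to 15 operand range, and the uniform mapping of the full-scale sum onto a 4b output code are all assumptions made for illustration; the abstract does not specify the operand encoding or the output quantization.

```python
def ideal_column_mac(inputs, weights, n_parallel=128, out_bits=4):
    """Idealized digital model of one column (neuron) MAC.

    Assumptions (not taken from the paper): operands are unsigned 4b
    integers in 0..15, and the output code is a uniform, round-to-nearest
    quantization of the full-scale dot product.
    """
    # The abstract states the macro handles 128, 64, or 32 parallel inputs.
    if n_parallel not in (32, 64, 128):
        raise ValueError("n_parallel must be 32, 64, or 128")
    if len(inputs) != n_parallel or len(weights) != n_parallel:
        raise ValueError("inputs and weights must each have n_parallel entries")
    for v in list(inputs) + list(weights):
        if not 0 <= v <= 15:
            raise ValueError("operands must fit in 4 unsigned bits (0..15)")

    # Exact dot product: what a perfectly linear MAC would accumulate.
    acc = sum(x * w for x, w in zip(inputs, weights))

    # Largest possible sum when every input and weight is 15.
    full_scale = n_parallel * 15 * 15
    levels = (1 << out_bits) - 1

    # Round-to-nearest mapping of [0, full_scale] onto [0, levels].
    code = (acc * levels + full_scale // 2) // full_scale
    return acc, code


# Example: all-ones operands in the 32-input configuration.
acc, code = ideal_column_mac([1] * 32, [1] * 32, n_parallel=32)
```

In this model, any deviation of a real analog readout from `acc` before quantization would appear as a linearity error, which is the non-ideality the abstract says the AND8T cell is designed to mitigate.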
Pages: 522-535
Page count: 14