A 40-nm MLC-RRAM Compute-in-Memory Macro With Sparsity Control, On-Chip Write-Verify, and Temperature-Independent ADC References

Cited by: 38

Authors:

Li, Wantong [1]

Sun, Xiaoyu [2]

Huang, Shanshi [1]

Jiang, Hongwu [1]

Yu, Shimeng [1]

Affiliations:

[1] Georgia Inst Technol, Sch Elect & Comp Engn, Atlanta, GA 30332 USA

[2] Taiwan Semicond Mfg Co TSMC, San Jose, CA 95132 USA

Source:

IEEE JOURNAL OF SOLID-STATE CIRCUITS | 2022 / Vol. 57 / No. 09

Keywords:

System-on-chip; Resistance; Common Information Model (computing); Sensors; Nonvolatile memory; Programming; Quantization (signal); Emerging non-volatile memories (NVMs); hardware accelerators; in-memory computing; machine learning; MONOLITHICALLY INTEGRATED RRAM; INFERENCE; CMOS;

DOI:

10.1109/JSSC.2022.3163197

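Chinese Library Classification (CLC):

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract: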

Resistive random access memory (RRAM)-based compute-in-memory (CIM) has shown great potential for accelerating deep neural network (DNN) inference. However, device characteristics, such as low-resistance values, susceptibility to drift, and single-level cells, may limit the capabilities of RRAM-based CIM. In addition, prior works generally used the off-chip write-verify scheme to tighten RRAM resistance distributions and used off-chip analog-to-digital converter (ADC) references for fine-tuning partial sum quantization. Although off-chip techniques are viable for testing purposes, they may be unsuitable for practical applications. In this work, we present an RRAM-CIM macro to accelerate DNN inference. The chip features: 1) multi-level cell (MLC) RRAM for improving compute performance and density; 2) sparsity-aware input control to leverage the high activation sparsity in DNN models; 3) on-chip write-verify to speed up initial weight programming and periodically refresh cells to compensate for resistance drift under stress; and 4) on-chip ADC reference generation that provides column-wise tunability and stability with varying temperatures to guarantee the CIFAR-10 accuracy of 85.8% at 120 °C. The design is fabricated in TSMC 40-nm process with embedded RRAM technology and achieves a macro-level peak performance of 97.8 GOPS/mm² and 44.5 TOPS/W for multiply-and-accumulate (MAC) operations on VGG-8 network with ternary weights.
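
Illustrative note (not part of the published abstract): the write-verify idea mentioned in feature 3 can be sketched generically as a program-then-read loop that stops once a cell's conductance falls inside a target window. The sketch below is a toy software model under stated assumptions, not the circuit described in the paper: the function name `write_verify`, the microsiemens units, the pulse step size, the tolerance window, and the random pulse-strength model are all hypothetical.

```python
import random


def write_verify(target_us, tol_us=1.0, step_us=1.5, max_pulses=100, seed=0):
    """Toy write-verify loop (hypothetical device model, not the paper's circuit).

    Returns (final_conductance_us, pulses_applied).
    """
    rng = random.Random(seed)
    # Assumption: the cell starts at an unknown conductance in [0, 50] uS.
    g = rng.uniform(0.0, 50.0)
    for pulses in range(max_pulses + 1):
        # Verify step: read the cell and compare against the target window.
        error = g - target_us
        if abs(error) <= tol_us:
            return g, pulses
        if pulses == max_pulses:
            break
        # Program step: one SET pulse (raise conductance) or RESET pulse
        # (lower conductance). Assumption: each pulse moves the cell by
        # 50-100% of step_us, and step_us is smaller than the window width
        # (2 * tol_us), so a single pulse cannot jump over the window.
        direction = 1.0 if error < 0 else -1.0
        g += direction * step_us * rng.uniform(0.5, 1.0)
    raise RuntimeError("cell did not reach the target window")


if __name__ == "__main__":
    g, n = write_verify(target_us=20.0)
    print(f"final conductance {g:.2f} uS after {n} pulses")
```

In this toy model a tighter `tol_us` narrows the final conductance distribution at the cost of more pulses, which is the general trade-off that write-verify schemes manage.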


Pages: 2868-2877

Page count: 10

