An Energy-Efficient High CSNR XNOR and Accumulation Scheme For BNN

Cited by: 7
Authors
Kushwaha, Dinesh [1 ]
Joshi, Ashish [2 ]
Kumar, Chaudhry Indra [3 ]
Gupta, Neha [1 ]
Miryala, Sandeep [4 ]
Joshi, Rajiv, V [5 ]
Dasgupta, Sudeb [1 ]
Bulusu, Anand [1 ]
Affiliations
[1] Indian Inst Technol Roorkee, Dept Elect & Commun Engn, Roorkee 247667, Uttar Pradesh, India
[2] Intel Technol India Pvt Ltd, IP Engn Grp IPG, Bengaluru 560103, India
[3] Delhi Technol Univ, Dept Elect Engn, New Delhi 110042, India
[4] Brookhaven Natl Lab, Instrumentat Div, Upton, NY 11973 USA
[5] Thomas J Watson Res Ctr, Yorktown Hts, NY 10598 USA
Keywords
Accuracy; accumulation; artificial intelligence (AI); compute signal margin (CSM); compute signal to noise ratio (CSNR); energy-efficiency; latency; neuron; SRAM; COMPUTING SRAM MACRO; IN-MEMORY MACRO; CNN ACCELERATOR;
DOI
10.1109/TCSII.2022.3149818
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline classification codes
0808; 0809;
Abstract
In this brief, we present an energy-efficient, high compute signal-to-noise ratio (CSNR) XNOR-and-accumulation (XAC) scheme for binary neural networks (BNNs). Transmission gates achieve a large compute signal margin (CSM) and high CSNR for accurate XAC operation. The 10T1C XNOR SRAM bit-cell performs the in-memory XAC operation without pre-charging the large bitline capacitances, significantly reducing the energy consumed per XAC operation. The proposed XAC scheme is validated through post-layout simulations in 65-nm CMOS technology at a V_DD of 1 V. The achieved latency of 1 ns and energy consumption of 2.36 fJ per XAC operation are (7.2x, 7.2x) and (2x, 1.31x) lower than state-of-the-art digital and analog compute-in-memory (CIM) XAC schemes, respectively. The proposed XAC design achieves an 8.6x improvement in figure of merit (FoM) over the prior state of the art. Moreover, an average (sigma/mu) of 0.2% from Monte Carlo simulations shows that the proposed XAC scheme is robust against systematic mismatch and process variations.
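The XAC primitive the abstract describes can be illustrated functionally: for vectors binarized to {-1, +1}, an element-wise multiply reduces to a bitwise XNOR followed by a popcount. The sketch below is a minimal software analogue under assumed encoding conventions (bit 1 maps to +1, bit 0 to -1); it is not the paper's 10T1C circuit, only the arithmetic it implements.

```python
# Functional sketch of an XNOR-and-accumulate (XAC) operation for a BNN.
# Encoding assumption (illustrative, not from the paper): each {-1, +1}
# vector is packed into an integer, with bit 1 = +1 and bit 0 = -1.

def xnor_accumulate(weights: int, activations: int, n_bits: int) -> int:
    """Dot product of two {-1, +1} vectors packed as n_bits-wide integers."""
    mask = (1 << n_bits) - 1
    # XNOR: bits agree -> 1 (product +1), bits disagree -> 0 (product -1)
    xnor = ~(weights ^ activations) & mask
    matches = bin(xnor).count("1")      # popcount = number of +1 products
    return 2 * matches - n_bits         # (+1 per match) + (-1 per mismatch)

# Example: w = [+1,-1,+1,-1] -> 0b1010, a = [+1,+1,-1,-1] -> 0b1100
print(xnor_accumulate(0b1010, 0b1100, 4))  # -> 0 (two matches, two mismatches)
```

In a CIM macro the popcount-and-shift step is what the analog accumulation replaces; the software form above is only a reference for checking a hardware model's outputs.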
Pages: 2311-2315
Number of pages: 5