Activity-difference training of deep neural networks using memristor crossbars

Cited by: 70
Authors
Yi, Su-in [1]
Kendall, Jack D. [2]
Williams, R. Stanley [1]
Kumar, Suhas [3]
Affiliations
[1] Texas A&M University, College Station, TX, USA
[2] Rain Neuromorphics, San Francisco, CA, USA
[3] Sandia National Laboratories, Livermore, CA 94551, USA
Funding
U.S. National Science Foundation (NSF)
Keywords
BACKPROPAGATION
DOI
10.1038/s41928-022-00869-w
Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline codes
0808; 0809
Abstract
Artificial neural networks have rapidly progressed in recent years, but are limited by the high energy costs required to train them on digital hardware. Emerging analogue hardware, such as memristor arrays, could offer improved energy efficiencies. However, the widely used backpropagation training algorithms are generally incompatible with such hardware because of mismatches between the analytically calculated training information and the imprecision of actual analogue devices. Here we report activity-difference-based training on co-designed tantalum oxide analogue memristor crossbars. Our approach, which we term memristor activity-difference energy minimization, treats the network parameters as a constrained optimization problem, and numerically calculates local gradients via Hopfield-like energy minimization using behavioural differences in the hardware targeted by the training. We use the technique to train one-layer and multilayer neural networks that can classify Braille words with high accuracy. With modelling, we show that our approach can offer over four orders of magnitude energy advantage compared with digital approaches for scaled-up problem sizes. An activity-difference training approach, which employs 64 x 64 memristor arrays with integrated complementary metal-oxide-semiconductor control circuitry, can be used to train a deep neural network to efficiently classify Braille words.
Pages
45-51 (7 pages)
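
The abstract describes training as a two-phase, Hopfield-like energy-minimization procedure in which local weight gradients are obtained from differences in the network's activity rather than from analytic backpropagation. The NumPy sketch below illustrates that general activity-difference (contrastive, equilibrium-propagation-style) idea in software only; it is not the paper's memristor activity-difference energy minimization (MADEM) implementation, and the layer sizes, hyperparameters and helper names (rho, relax, train_step, beta, lr) are illustrative assumptions. On a memristor crossbar, the matrix-vector products and the relaxation to an energy minimum would be performed by the analogue array itself.

import numpy as np

rng = np.random.default_rng(0)

# Toy layer sizes: a small Braille-like input, one hidden group, a few output classes.
n_in, n_hid, n_out = 6, 16, 3
n = n_in + n_hid + n_out

# Symmetric coupling matrix with zero diagonal (Hopfield-style network).
W = rng.normal(0.0, 0.1, (n, n))
W = 0.5 * (W + W.T)
np.fill_diagonal(W, 0.0)

def rho(s):
    # Bounded "activity" of each unit (hard sigmoid).
    return np.clip(s, 0.0, 1.0)

def relax(W, x, y=None, beta=0.0, steps=80, dt=0.2):
    # Settle the hidden/output units by gradient descent on the Hopfield-like
    # energy E(s) = 0.5*||s||^2 - 0.5*rho(s)^T W rho(s); when beta > 0, an extra
    # cost beta*0.5*||rho(s_out) - y||^2 weakly nudges the outputs toward y.
    s = np.zeros(n)
    s[:n_in] = x                                      # inputs stay clamped
    for _ in range(steps):
        r = rho(s)
        dr = ((s > 0.0) & (s < 1.0)).astype(float)    # derivative of the clip
        grad = s - dr * (W @ r)                       # dE/ds
        if y is not None and beta > 0.0:
            grad[-n_out:] += beta * dr[-n_out:] * (r[-n_out:] - y)
        s[n_in:] -= dt * grad[n_in:]                  # only free units evolve
    return s

def train_step(W, x, y, beta=0.5, lr=0.05):
    # Activity-difference update: run a free phase and a weakly nudged phase,
    # then change each coupling in proportion to the difference of the local
    # activity products (a purely local, contrastive learning rule).
    s_free = relax(W, x)
    s_nudge = relax(W, x, y, beta=beta)
    r0, rb = rho(s_free), rho(s_nudge)
    dW = (np.outer(rb, rb) - np.outer(r0, r0)) / beta
    W = W + lr * dW
    W = 0.5 * (W + W.T)                               # re-enforce symmetry
    np.fill_diagonal(W, 0.0)                          # no self-couplings
    return W, r0[-n_out:]

# Tiny demo: fit a few random binary patterns to one-hot labels.
X = rng.integers(0, 2, size=(4, n_in)).astype(float)
Y = np.eye(n_out)[rng.integers(0, n_out, size=4)]
for _ in range(100):
    for x, y in zip(X, Y):
        W, out = train_step(W, x, y)
for x, y in zip(X, Y):
    print(np.round(rho(relax(W, x))[-n_out:], 2), "target", y)

Note that the update in train_step uses only activity products that are locally available at each coupling, which is what makes this family of schemes attractive for imprecise analogue hardware where analytically computed backpropagation gradients do not match device behaviour.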