An analog-AI chip for energy-efficient speech recognition and transcription

Cited by: 107
Authors
Ambrogio, S. [1 ]
Narayanan, P. [1 ]
Okazaki, A. [2 ]
Fasoli, A. [1 ]
Mackin, C. [1 ]
Hosokawa, K. [2 ]
Nomura, A. [2 ]
Yasuda, T. [2 ]
Chen, A. [1 ]
Friz, A. [1 ]
Ishii, M. [2 ]
Luquin, J. [1 ]
Kohda, Y. [2 ]
Saulnier, N. [3 ]
Brew, K. [3 ]
Choi, S. [3 ]
Ok, I. [3 ]
Philip, T. [3 ]
Chan, V. [3 ]
Silvestre, C. [3 ]
Ahsan, I. [3 ]
Narayanan, V. [4 ]
Tsai, H. [1 ]
Burr, G. W. [1 ]
Affiliations
[1] IBM Research - Almaden, San Jose, CA 95120, USA
[2] IBM Research - Tokyo, Kawasaki, Japan
[3] IBM Research Albany NanoTech Center, Albany, NY, USA
[4] IBM Thomas J. Watson Research Center, Yorktown Heights, NY, USA
DOI: 10.1038/s41586-023-06337-5
Chinese Library Classification: O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Subject classification codes: 07; 0710; 09
Abstract
Models of artificial intelligence (AI) that have billions of parameters can achieve high accuracy across a range of tasks [1,2], but they exacerbate the poor energy efficiency of conventional general-purpose processors, such as graphics processing units or central processing units. Analog in-memory computing (analog-AI) [3-7] can provide better energy efficiency by performing matrix-vector multiplications in parallel on 'memory tiles'. However, analog-AI has yet to demonstrate software-equivalent (SWeq) accuracy on models that require many such tiles and efficient communication of neural-network activations between the tiles. Here we present an analog-AI chip that combines 35 million phase-change memory devices across 34 tiles, massively parallel inter-tile communication and analog, low-power peripheral circuitry that can achieve up to 12.4 tera-operations per second per watt (TOPS/W) chip-sustained performance. We demonstrate fully end-to-end SWeq accuracy for a small keyword-spotting network and near-SWeq accuracy on the much larger MLPerf [8] recurrent neural-network transducer (RNNT), with more than 45 million weights mapped onto more than 140 million phase-change memory devices across five chips. A low-power chip that runs AI models using analog rather than digital computation shows comparable accuracy on speech-recognition tasks but is more than 14 times as energy efficient.
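To make the 'memory tile' operation described in the abstract concrete, below is a minimal NumPy sketch of an analog in-memory matrix-vector multiply. The differential-pair encoding (each weight stored as the difference of two non-negative conductances), the G_MAX full-scale bound and the Gaussian programming-noise model are common analog-AI conventions assumed here for illustration only; program_tile and analog_mvm are hypothetical names, not functions or parameters taken from the paper or its hardware.

```python
import numpy as np

rng = np.random.default_rng(0)

G_MAX = 25e-6     # assumed full-scale device conductance, in siemens (illustrative)
NOISE_STD = 0.02  # assumed programming-noise level, relative to G_MAX (illustrative)

def program_tile(W):
    """Map a weight matrix onto a differential pair of conductance arrays.

    Positive weights are programmed into G+, negative weights into G-,
    so that W is proportional to (G+ - G-). Additive Gaussian noise
    stands in for imperfect conductance programming.
    """
    scale = G_MAX / np.max(np.abs(W))      # largest |weight| maps to G_MAX
    g_pos = np.clip(W, 0, None) * scale    # positive part of W
    g_neg = np.clip(-W, 0, None) * scale   # negative part of W
    noise = rng.normal(0.0, NOISE_STD * G_MAX, size=(2,) + W.shape)
    return g_pos + noise[0], g_neg + noise[1], scale

def analog_mvm(g_pos, g_neg, scale, x):
    """One tile operation: with inputs applied as voltages, Ohm's and
    Kirchhoff's laws sum the per-device currents, so every
    multiply-accumulate in (G+ - G-) @ x happens in parallel."""
    return (g_pos - g_neg) @ x / scale

# Compare the noisy analog result against the exact digital product.
W = rng.standard_normal((512, 512))
x = rng.standard_normal(512)
g_pos, g_neg, scale = program_tile(W)
print("max abs deviation:", np.max(np.abs(analog_mvm(g_pos, g_neg, scale, x) - W @ x)))
```

The sketch illustrates both why the approach is attractive and why SWeq accuracy is the hard part: the multiply-accumulates come essentially for free in the conductance array, but programming noise perturbs every weight, and those perturbations compound across the many tiles that a 45-million-weight model such as the RNNT occupies.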
Pages: 768+
Page count: 23
References
41 entries in total
[1] Ambrogio, S.; Narayanan, P.; Tsai, H.; Shelby, R. M.; Boybat, I.; di Nolfo, C.; Sidler, S.; Giordano, M.; Bodini, M.; Farinha, N. C. P.; Killeen, B.; Cheng, C.; Jaoudi, Y.; Burr, G. W. Equivalent-accuracy accelerated neural-network training using analogue memory. Nature, 2018, 558(7708): 60+.
[2] Anonymous. BETT MACH LEARN EV, 2023.
[3] Bahdanau, D. et al. Neural machine translation by jointly learning to align and translate. arXiv:1409.0473, 2016.
[4] Biswas, A. 2018 IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, 2018: 488. DOI: 10.1109/ISSCC.2018.8310397.
[5] Chan, W. arXiv, 2021. DOI: 10.48550/arXiv.2104.02133.
[6] Chang, H.-Y.; Narayanan, P.; Lewis, S. C.; Farinha, N. C. P.; Hosokawa, K.; Mackin, C.; Tsai, H.; Ambrogio, S.; Chen, A.; Burr, G. W. AI hardware acceleration with analog memory: microarchitectures for low energy at high speed. IBM Journal of Research and Development, 2019, 63(6).
[7] Chen, G. G. 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2014.
[8] Xue, C.-X.; Hung, J.-M.; Kao, H.-Y.; Huang, Y.-H.; Huang, S.-P.; Chang, F.-C.; Chen, P.; Liu, T.-W.; Jhang, C.-J.; Su, C.-I; Khwa, W.-S.; Lo, C.-C.; Liu, R.-S.; Hsieh, C.-C.; Tang, K.-T.; Chih, Y.-D.; Chang, T.-Y. J.; Chang, M.-F. A 22nm 4Mb 8b-precision ReRAM computing-in-memory macro with 11.91 to 195.7TOPS/W for tiny AI edge devices. 2021 IEEE International Solid-State Circuits Conference (ISSCC), 2021, 64: 246+.
[9] Chih, Y.-D.; Lee, P.-H.; Fujiwara, H.; Shih, Y.-C.; Lee, C.-F.; Naous, R.; Chen, Y.-L.; Lo, C.-P.; Lu, C.-H.; Mori, H.; Zhao, W.-C.; Sun, D.; Sinangil, M. E.; Chen, Y.-H.; Chou, T.-L.; Akarvardar, K.; Liao, H.-J.; Wang, Y.; Chang, M.-F.; Chang, T.-Y. J. An 89TOPS/W and 16.3TOPS/mm2 all-digital SRAM-based full-precision compute-in-memory macro in 22nm for machine-learning edge applications. 2021 IEEE International Solid-State Circuits Conference (ISSCC), 2021, 64: 252+.
[10] Dahl, G. E.; Yu, D.; Deng, L.; Acero, A. Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition. IEEE Transactions on Audio, Speech, and Language Processing, 2012, 20(1): 30-42.