Machine Learning Hardware Design for Efficiency, Flexibility, and Scalability [Feature]

Cited by: 1
Authors
Zhang, Jie-Fang [1 ]
Zhang, Zhengya [1 ]
Affiliations
[1] Univ Michigan, Dept Elect Engn & Comp Sci, Ann Arbor, MI 48109 USA
Keywords
Surveys; Scalability; Multichip modules; Artificial neural networks; Machine learning; Bandwidth; Tutorials; Design engineering; Hardware design languages; ML hardware; DNN accelerator; sparse DNN architecture; DNN chiplet; heterogeneous integration; DEEP NEURAL-NETWORKS; ACCELERATION; SPARSE;
DOI
10.1109/MCAS.2023.3302390
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline Classification Codes
0808; 0809;
Abstract
The widespread use of deep neural networks (DNNs) and DNN-based machine learning (ML) methods justifies treating DNN computation as a workload class of its own. Beginning with a brief review of DNN workloads and computation, we provide an overview of single instruction multiple data (SIMD) and systolic array architectures. These two basic architectures support the kernel operations for DNN computation, and they form the core of many flexible DNN accelerators. To achieve higher performance and efficiency, sparse DNN hardware can be designed to exploit data sparsity. We present common approaches, from compressed storage to sparse data processing, that reduce memory and bandwidth usage and improve energy efficiency and performance. To accommodate the rapid evolution toward larger and more complex models, modular chiplet integration offers a promising path to meet growing needs. We show recent work on homogeneous tiling and heterogeneous integration to scale up and scale out hardware to support larger models and more complex functions.
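As a rough illustration of the compressed-storage and sparse-processing idea summarized in the abstract (a minimal sketch, not taken from the article; the function names, CSR-style format choice, and threshold parameter are assumptions for illustration only), the following Python example compresses a sparse weight matrix into value/index/pointer arrays and performs a matrix-vector product that touches only nonzero weights, analogous to how sparse accelerators skip ineffectual operands to save memory, bandwidth, and energy.

import numpy as np

def dense_to_csr(w, threshold=0.0):
    # Compress a dense weight matrix into CSR-style arrays
    # (values, column indices, row pointers), keeping only
    # entries whose magnitude exceeds the threshold.
    # Hypothetical helper for illustration, not from the article.
    values, col_idx, row_ptr = [], [], [0]
    for row in w:
        for j, v in enumerate(row):
            if abs(v) > threshold:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))
    return np.array(values), np.array(col_idx), np.array(row_ptr)

def csr_matvec(values, col_idx, row_ptr, x):
    # Sparse matrix-vector product: only stored (nonzero) weights
    # are fetched and multiplied, mirroring how sparse DNN hardware
    # avoids work on zero operands.
    y = np.zeros(len(row_ptr) - 1)
    for i in range(len(y)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y

# Usage: a roughly 70%-sparse 4x8 weight matrix; the compressed
# product matches the dense result while storing far fewer weights.
rng = np.random.default_rng(0)
w = rng.standard_normal((4, 8)) * (rng.random((4, 8)) > 0.7)
x = rng.standard_normal(8)
vals, cols, ptrs = dense_to_csr(w)
assert np.allclose(csr_matvec(vals, cols, ptrs, x), w @ x)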
Pages: 35-53
Page count: 19