New Flexible Multiple-Precision Multiply-Accumulate Unit for Deep Neural Network Training and Inference

Cited by: 29
Authors
Zhang, Hao [1 ]
Chen, Dongdong [2 ]
Ko, Seok-Bum [1 ]
Affiliations
[1] Univ Saskatchewan, Dept Elect & Comp Engn, Saskatoon, SK S7N 5A2, Canada
[2] Intel Corp, San Jose, CA 95134 USA
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Neural networks; Standards; Deep learning; Training; Hardware; Adders; Pipelines; Multiply-accumulate unit; multiple-precision arithmetic; flexible precision arithmetic; deep neural network computing; computer arithmetic; ADD;
DOI
10.1109/TC.2019.2936192
Chinese Library Classification (CLC) code
TP3 [computing technology, computer technology];
Discipline classification code
0812;
Abstract
In this paper, a new flexible multiple-precision multiply-accumulate (MAC) unit is proposed for deep neural network training and inference. The proposed MAC unit supports both fixed-point and floating-point operations. In floating-point format, the unit supports one 16-bit MAC operation or the sum of two 8-bit multiplications plus a 16-bit addend. To make the proposed MAC unit more versatile, the bit-widths of the exponent and mantissa can be flexibly exchanged, and by setting the exponent bit-width to zero the unit also supports fixed-point operations. In fixed-point format, the unit supports one 16-bit MAC or the sum of two 8-bit multiplications plus a 16-bit addend, and it can be further divided to support the sum of four 4-bit multiplications plus a 16-bit addend. At the lowest precision, the proposed MAC unit supports the accumulation of eight 1-bit logic AND operations to enable binary neural networks. Compared to a standard 16-bit half-precision MAC unit, the proposed MAC unit provides much more flexibility with only 21.8 percent area overhead. Compared to a standard 32-bit single-precision MAC unit, the proposed MAC unit requires much less hardware while still providing an 8-bit exponent in the numerical format to maintain the large dynamic range needed for deep learning computation.
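For illustration, the precision modes listed in the abstract can be summarized in a small behavioral model. The Python sketch below is an assumption-based functional illustration only: the mode names and the helper flexible_mac are hypothetical and do not come from the paper, and the model covers only the arithmetic accumulated in each mode, not the shared multiplier, alignment, and adder hardware the authors describe.

# Behavioral sketch (assumption, not the authors' hardware design): a functional
# model of the precision modes described in the abstract. Mode names are illustrative.
def flexible_mac(mode, a, b, addend):
    """Return addend plus the dot product implied by the selected precision mode."""
    if mode in ("fp16", "int16"):      # one 16-bit MAC (floating- or fixed-point)
        return addend + a[0] * b[0]
    if mode in ("fp8x2", "int8x2"):    # sum of two 8-bit products plus a 16-bit addend
        return addend + a[0] * b[0] + a[1] * b[1]
    if mode == "int4x4":               # sum of four 4-bit products plus a 16-bit addend
        return addend + sum(x * y for x, y in zip(a[:4], b[:4]))
    if mode == "bin8":                 # accumulate eight 1-bit logic AND results (binary networks)
        return addend + sum(x & y for x, y in zip(a[:8], b[:8]))
    raise ValueError("unknown mode: " + mode)

# Example: four 4-bit products accumulated onto a 16-bit partial sum.
print(flexible_mac("int4x4", [1, 2, 3, 4], [5, 6, 7, 8], 100))   # prints 170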
Pages: 26-38
Page count: 13
Related papers
50 records in total
  • [21] PV-MAC: Multiply-and-accumulate unit structure exploiting precision variability in on-device convolutional neural networks
    Kang, Jongsung
    Kim, Taewhan
    INTEGRATION-THE VLSI JOURNAL, 2020, 71 : 76 - 85
  • [22] Efficient spiking neural network training and inference with reduced precision memory and computing
    Wang, Yi
    Shahbazi, Karim
    Zhang, Hao
    Oh, Kwang-Il
    Lee, Jae-Jin
    Ko, Seok-Bum
    IET COMPUTERS AND DIGITAL TECHNIQUES, 2019, 13 (05) : 397 - 404
  • [23] Fast Sparse Deep Neural Network Inference with Flexible SpMM Optimization Space Exploration
    Xin, Jie
    Ye, Xianqi
    Zheng, Long
    Wang, Qinggang
    Huang, Yu
    Yao, Pengcheng
    Yu, Linchen
    Liao, Xiaofei
    Jin, Hai
    2021 IEEE HIGH PERFORMANCE EXTREME COMPUTING CONFERENCE (HPEC), 2021,
  • [24] FloatPIM: In-Memory Acceleration of Deep Neural Network Training with High Precision
    Imani, Mohsen
    Gupta, Saransh
    Kim, Yeseong
    Rosing, Tajana
    PROCEEDINGS OF THE 2019 46TH INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA '19), 2019, : 802 - 815
  • [25] Efficient deep neural network training via decreasing precision with layer capacity
    Shen, Ao
    Lai, Zhiquan
    Sun, Tao
    Li, Shengwei
    Ge, Keshi
    Liu, Weijie
    Li, Dongsheng
    FRONTIERS OF COMPUTER SCIENCE, 2025, 19 (10)
  • [26] LDD: High-Precision Training of Deep Spiking Neural Network Transformers Guided by an Artificial Neural Network
    Liu, Yuqian
    Zhao, Chujie
    Jiang, Yizhou
    Fang, Ying
    Chen, Feng
    BIOMIMETICS, 2024, 9 (07)
  • [27] A training method for deep neural network inference accelerators with high tolerance for their hardware imperfection
    Gao, Shuchao
    Ohsawa, Takashi
    JAPANESE JOURNAL OF APPLIED PHYSICS, 2024, 63 (02)
  • [28] Mixed-precision Deep Neural Network Quantization With Multiple Compression Rates
    Wang, Xuanda
    Fei, Wen
    Dai, Wenrui
    Li, Chenglin
    Zou, Junni
    Xiong, Hongkai
    2023 DATA COMPRESSION CONFERENCE, DCC, 2023, : 371 - 371
  • [29] Training deep neural network on multiple GPUs with a model averaging method
    Yao, Qiongjie
    Liao, Xiaofei
    Jin, Hai
    PEER-TO-PEER NETWORKING AND APPLICATIONS, 2018, 11 : 1012 - 1021
  • [30] Fast training algorithm for deep neural network using multiple GPUs
    Dai, L.
    JOURNAL OF TSINGHUA UNIVERSITY (SCIENCE AND TECHNOLOGY), 53