ERIDANUS: Efficiently Running Inference of DNNs Using Systolic Arrays

Citations: 22
Authors
Asgari, Bahar [1]
Hadidi, Ramyad [2]
Kim, Hyesoon [3]
Yalamanchili, Sudhakar [4]
Affiliations
[1] Georgia Inst Technol, Sch Elect & Comp Engn, Atlanta, GA 30332 USA
[2] Georgia Inst Technol, Comp Sci, Atlanta, GA 30332 USA
[3] Georgia Inst Technol, Sch Comp Sci, Atlanta, GA 30332 USA
[4] Georgia Inst Technol, Sch Elect & Comp Engn, Comp Engn, Atlanta, GA 30332 USA
Funding
National Science Foundation (USA)
Keywords
Deep neural network; efficient inference; pruning; systolic arrays
DOI
10.1109/MM.2019.2930057
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Subject classification code
0812
Abstract
Systolic arrays offer a high degree of concurrent computation and a high data-reuse rate, which makes them attractive for dense linear algebra. Recently, systolic arrays have been used to accelerate the inference of deep neural networks (DNNs). However, because sparsification mechanisms are applied to DNNs during or after training, DNN inference is usually a sparse problem and therefore cannot fully benefit from the fundamental advantages of systolic arrays. To address this challenge, we propose Eridanus, an approach to structured pruning that produces DNNs compatible with the synchronous and rhythmic flow of data from memory to systolic arrays.
Pages: 46-54
Page count: 9