Optimizing Grouped Convolutions on Edge Devices

Cited by: 17
Authors
Gibson, Perry [1]
Cano, Jose [1]
Turner, Jack [2]
Crowley, Elliot J. [2]
O'Boyle, Michael [2]
Storkey, Amos [2]
Affiliations
[1] Univ Glasgow, Glasgow, Lanark, Scotland
[2] Univ Edinburgh, Edinburgh, Midlothian, Scotland
Source
2020 IEEE 31ST INTERNATIONAL CONFERENCE ON APPLICATION-SPECIFIC SYSTEMS, ARCHITECTURES AND PROCESSORS (ASAP 2020) | 2020
Funding
UK Engineering and Physical Sciences Research Council (EPSRC)
DOI
10.1109/ASAP49362.2020.00039
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
When deploying a deep neural network on constrained hardware, it is possible to replace the network's standard convolutions with grouped convolutions. This allows for substantial memory savings with minimal loss of accuracy. However, current implementations of grouped convolutions in modern deep learning frameworks are far from performing optimally in terms of speed. In this paper we propose Grouped Spatial Pack Convolutions (GSPC), a new implementation of grouped convolutions that outperforms existing solutions. We implement GSPC in TVM, which provides state-of-the-art performance on edge devices. We analyze a set of networks utilizing different types of grouped convolutions and evaluate their performance in terms of inference time on several edge devices. We observe that our new implementation scales well with the number of groups and provides the best inference times in all settings, improving on the existing implementations of grouped convolutions in TVM, PyTorch and TensorFlow Lite by 3.4x, 8x and 4x on average, respectively. Code is available at https://github.com/gecLAB/tvm-GSPC/
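The memory savings mentioned in the abstract follow from how a grouped convolution partitions its channels: with G groups, each output filter sees only 1/G of the input channels, so the weight count shrinks by a factor of G. The sketch below works through that arithmetic. It is not the GSPC implementation, and the function name `conv_weight_count` and the layer sizes (256 channels, 3x3 kernel, 8 groups) are illustrative assumptions, not values taken from the paper.

```python
def conv_weight_count(c_in, c_out, kernel, groups=1):
    """Number of weights in a 2D convolution layer (bias excluded).

    Hypothetical helper for illustration; sizes used below are assumptions.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each of the c_out filters only spans c_in // groups input channels.
    return c_out * (c_in // groups) * kernel * kernel


standard = conv_weight_count(256, 256, 3)               # groups=1: standard convolution
grouped = conv_weight_count(256, 256, 3, groups=8)      # grouped convolution, G=8
depthwise = conv_weight_count(256, 256, 3, groups=256)  # G = channels: depthwise case

print(standard, grouped, depthwise)
print("reduction factor with 8 groups:", standard // grouped)
```

Under these assumed sizes, the weight count drops in direct proportion to the number of groups. The inference-time behaviour that the paper measures is a separate question and is not captured by this count.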
Pages: 189-196
Number of pages: 8