GhostNets on Heterogeneous Devices via Cheap Operations

Cited by: 78
Authors
Han, Kai [1 ,2 ]
Wang, Yunhe [2 ]
Xu, Chang [3 ]
Guo, Jianyuan [2 ,3 ]
Xu, Chunjing [2 ]
Wu, Enhua [1 ,4 ]
Tian, Qi [2 ]
Affiliations
[1] Univ Chinese Acad Sci, State Key Lab Comp Sci, ISCAS, Beijing, Peoples R China
[2] Huawei Noah's Ark Lab, Shenzhen, Peoples R China
[3] Univ Sydney, Sydney, NSW, Australia
[4] Univ Macau, Macau, Peoples R China
Funding
Australian Research Council;
Keywords
Convolutional neural networks; Efficient inference; Visual recognition;
DOI
10.1007/s11263-022-01575-y
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Deploying convolutional neural networks (CNNs) on mobile devices is difficult due to limited memory and computation resources. We aim to design efficient neural networks for heterogeneous devices, including CPUs and GPUs, by exploiting the redundancy in feature maps, which has rarely been investigated in neural architecture design. For CPU-like devices, we propose a novel CPU-efficient Ghost (C-Ghost) module to generate more feature maps from cheap operations. Starting from a set of intrinsic feature maps, we apply a series of cheap linear transformations to generate many ghost feature maps that reveal the information underlying the intrinsic features. The proposed C-Ghost module can serve as a plug-and-play component to upgrade existing convolutional neural networks. C-Ghost bottlenecks are designed to stack C-Ghost modules, from which the lightweight C-GhostNet is easily built. We further consider efficient networks for GPU devices. To avoid GPU-inefficient operations (e.g., depth-wise convolution) within a building stage, we propose to exploit stage-wise feature redundancy to formulate the GPU-efficient Ghost (G-Ghost) stage structure. The features in a stage are split into two parts: the first part is processed by the original blocks, with fewer output channels, to generate intrinsic features, while the other part is generated by cheap operations that exploit stage-wise redundancy. Experiments conducted on benchmarks demonstrate the effectiveness of the proposed C-Ghost module and G-Ghost stage. C-GhostNet and G-GhostNet achieve a state-of-the-art trade-off between accuracy and latency on CPUs and GPUs, respectively.
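To make the two designs concrete, here is a minimal PyTorch-style sketch of a C-Ghost module, assuming a depthwise convolution as the cheap linear transformation; the class and parameter names are illustrative and not the authors' released code.

```python
import torch
import torch.nn as nn


class CGhostModule(nn.Module):
    """Illustrative C-Ghost module: a primary convolution yields a few
    intrinsic feature maps; cheap depthwise convolutions (the "linear
    transformations") generate the remaining ghost maps."""

    def __init__(self, in_ch, out_ch, ratio=2, kernel=1, cheap_kernel=3):
        super().__init__()
        intrinsic = out_ch // ratio   # maps from the costly convolution
        ghost = out_ch - intrinsic    # maps from cheap operations
        self.primary = nn.Sequential(
            nn.Conv2d(in_ch, intrinsic, kernel, padding=kernel // 2, bias=False),
            nn.BatchNorm2d(intrinsic),
            nn.ReLU(inplace=True),
        )
        # Depthwise convolution: one cheap filter per intrinsic channel
        # (assumes ghost is a multiple of intrinsic, true for ratio=2).
        self.cheap = nn.Sequential(
            nn.Conv2d(intrinsic, ghost, cheap_kernel, padding=cheap_kernel // 2,
                      groups=intrinsic, bias=False),
            nn.BatchNorm2d(ghost),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        y = self.primary(x)                           # intrinsic feature maps
        return torch.cat([y, self.cheap(y)], dim=1)   # intrinsic + ghost maps
```

A G-Ghost stage applies the same idea across a whole stage rather than inside one layer. The sketch below assumes a user-supplied `block(in_ch, out_ch)` factory for the stage's ordinary building block and a plain 1x1 convolution as the cheap path; both choices are illustrative.

```python
class GGhostStage(nn.Module):
    """Illustrative G-Ghost stage: a thinned stack of ordinary blocks computes
    intrinsic features; a cheap 1x1 convolution reuses the first block's
    output (stage-wise redundancy) to produce the remaining channels."""

    def __init__(self, block, in_ch, out_ch, num_blocks, ratio=0.5):
        super().__init__()
        intrinsic = int(out_ch * ratio)   # channels on the complicated path
        ghost = out_ch - intrinsic        # channels on the cheap path
        self.first = block(in_ch, intrinsic)
        self.rest = nn.Sequential(
            *[block(intrinsic, intrinsic) for _ in range(num_blocks - 1)]
        )
        self.cheap = nn.Sequential(       # GPU-friendly: no depthwise conv
            nn.Conv2d(intrinsic, ghost, 1, bias=False),
            nn.BatchNorm2d(ghost),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        y1 = self.first(x)                             # early-stage features
        y = self.rest(y1)                              # deep intrinsic features
        return torch.cat([y, self.cheap(y1)], dim=1)   # intrinsic + ghost


def conv_block(cin, cout):  # trivial stand-in for the stage's ordinary block
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1, bias=False),
                         nn.BatchNorm2d(cout), nn.ReLU(inplace=True))


stage = GGhostStage(conv_block, 64, 128, num_blocks=3)
print(stage(torch.randn(1, 64, 56, 56)).shape)  # torch.Size([1, 128, 56, 56])
```

In this sketch, most of the stage runs at half width and the cheap 1x1 path fills in the remaining channels, which is the stage-wise analogue of the per-layer ghost maps above.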
Pages: 1050-1069
Page count: 20
相关论文
共 93 条
  • [1] Abadi M, 2015, TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems
  • [2] Cai H, 2019, ICLR, P1, DOI DOI 10.48550/ARXIV.1812.00332
  • [3] AdderNet: Do We Really Need Multiplications in Deep Learning?
    Chen, Hanting
    Wang, Yunhe
    Xu, Chunjing
    Shi, Boxin
    Xu, Chao
    Tian, Qi
    Xu, Chang
    [J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, : 1465 - 1474
  • [4] Data-Free Learning of Student Networks
    Chen, Hanting
    Wang, Yunhe
    Xu, Chang
    Yang, Zhaohui
    Liu, Chuanjian
    Shi, Boxin
    Xu, Chunjing
    Xu, Chao
    Tian, Qi
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 3513 - 3521
  • [5] Chen K., 2019, ARXIV190607155
  • [6] DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs
    Chen, Liang-Chieh
    Papandreou, George
    Kokkinos, Iasonas
    Murphy, Kevin
    Yuille, Alan L.
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2018, 40 (04) : 834 - 848
  • [7] Chen W., 2020, P ICLR
  • [8] All You Need is a Few Shifts: Designing Efficient Convolutional Neural Networks for Image Classification
    Chen, Weijie
    Xie, Di
    Zhang, Yuan
    Pu, Shiliang
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 7234 - 7243
  • [9] Towards Efficient Model Compression via Learned Global Ranking
    Chin, Ting-Wu
    Ding, Ruizhou
    Zhang, Cha
    Marculescu, Diana
    [J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, : 1515 - 1525
  • [10] Chollet F., 2016, IEEE C COMP VIS PATT, P1251, DOI [DOI 10.1109/CVPR.2017.195, 10.48550/ARXIV.1610.02357]