Dynamic Split Computing-Aware Mixed-Precision Quantization for Efficient Deep Edge Intelligence

Times Cited: 0
Authors
Nagamatsu, Naoki [1 ]
Hara-Azumi, Yuko [1 ]
Affiliations
[1] Tokyo Inst Technol, Meguro Ku, Tokyo 1528552, Japan
Source
2023 IEEE 22ND INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS, TRUSTCOM, BIGDATASE, CSE, EUC, ISCI 2023 | 2024
Keywords
Deep Neural Networks; Split Computing; Mixed-Precision Quantization; Neural Architecture Search;
DOI
10.1109/TrustCom60117.2023.00355
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Deploying large deep neural networks (DNNs) on IoT and mobile devices poses a significant challenge due to hardware resource limitations. To address this challenge, an edge-cloud integration technique called split computing (SC) is attractive: it improves inference time by splitting a single DNN model into two sub-models, processed on an edge device and a server, respectively. Dynamic split computing (DSC) is an emerging extension of SC that determines the split point dynamically, depending on the communication conditions. In this work, we propose a DNN architecture optimization method for DSC. Our contributions are twofold. (1) First, we develop a DSC-aware mixed-precision quantization method that exploits neural architecture search (NAS). Through NAS, we efficiently explore the optimal bitwidth of each layer in a huge design space to construct potential split points in the target DNN; the more potential split points a DNN has, the more flexibly it can select one depending on the communication conditions. (2) Second, to improve end-to-end inference time, we propose a new bitwidth-wise DSC (BW-DSC) algorithm that dynamically determines the optimal split point among the potential split points in the mixed-precision quantized DNN architecture. Our evaluation demonstrates that our method provides more effective split points than existing works while mitigating the degradation of inference accuracy. In terms of end-to-end inference time, our method achieves an average improvement of 16.47%, and up to 24.36%, over a state-of-the-art work.
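To make the dynamic split decision concrete, below is a minimal Python sketch of the kind of latency model a bitwidth-aware split selection could use: for each candidate split point, the estimated end-to-end time is the edge-side compute time, plus the time to transmit the quantized split-point activation over the current link, plus the server-side compute time, and the candidate with the minimum estimate is chosen. This is an illustrative sketch under those assumptions, not the paper's BW-DSC implementation; all identifiers and numbers below are hypothetical.

# Hypothetical sketch of latency-driven split-point selection.
# Not the authors' BW-DSC code; names and numbers are illustrative.
from dataclasses import dataclass

@dataclass
class SplitCandidate:
    layer: int            # index of the layer after which the model is split
    edge_ms: float        # edge-side compute time up to this layer (ms)
    server_ms: float      # server-side compute time for the remainder (ms)
    activation_bits: int  # quantized bitwidth of the split-point activation
    activation_elems: int # number of elements in the activation tensor

def transfer_ms(c: SplitCandidate, bandwidth_mbps: float) -> float:
    """Time (ms) to send the quantized split-point activation to the server."""
    megabits = c.activation_elems * c.activation_bits / 1e6
    return 1000.0 * megabits / bandwidth_mbps

def best_split(candidates, bandwidth_mbps: float) -> SplitCandidate:
    """Pick the candidate minimizing estimated end-to-end inference time."""
    return min(candidates,
               key=lambda c: c.edge_ms + transfer_ms(c, bandwidth_mbps) + c.server_ms)

# Hypothetical example: two potential split points in a quantized DNN.
candidates = [
    SplitCandidate(layer=4,  edge_ms=3.0, server_ms=6.0,
                   activation_bits=8, activation_elems=200_000),
    SplitCandidate(layer=10, edge_ms=9.0, server_ms=2.5,
                   activation_bits=4, activation_elems=50_000),
]
print(best_split(candidates, bandwidth_mbps=5.0).layer)  # -> 10 on a slow link

Under a slow link the transfer term dominates, so a later split point with a smaller, more aggressively quantized activation wins; under a fast link an earlier split that offloads more compute to the server wins. This is the flexibility that constructing more potential split points, via per-layer bitwidth search, is meant to buy.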
Pages: 2538-2545
Number of Pages: 8