DNN Split Computing: Quantization and Run-length Coding are Enough

Cited by: 1
Authors
Carra, Damiano [1]
Neglia, Giovanni [2]
Affiliations
[1] Univ Verona, Verona, Italy
[2] Univ Cote Azur, Inria, Nice, France
Source
IEEE CONFERENCE ON GLOBAL COMMUNICATIONS, GLOBECOM | 2023
DOI
10.1109/GLOBECOM54140.2023.10437445
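Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809
Abstract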
Split computing, a recently developed paradigm, capitalizes on the computational resources of end devices to enhance inference efficiency in machine learning (ML) applications. In this approach, the end device processes the input data and transmits intermediate results to a cloud server, which then completes the inference computation. While the main goals of split computing are to reduce latency, minimize energy consumption, and decrease data transfer overhead, minimizing data transmission time remains a challenge. Many existing strategies modify the ML model architecture, which ultimately requires resource-intensive retraining. In our work, we explore lossless and lossy techniques to encode intermediate results without modifying the ML model. Concentrating on image classification and object detection, two prevalent ML applications, we assess the advantages and limitations of each technique. Our findings indicate that simple tools, such as linear quantization and run-length encoding, already achieve a considerable reduction in the amount of information, on par with more complex state-of-the-art techniques that necessitate model retraining. These tools are computationally efficient and do not burden the end device.
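The two tools named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify bit widths, quantization ranges, or the run-length format, so the function names, the 2-bit setting, and the sample activation values below are all hypothetical. The example also assumes that the intermediate tensor is sparse (for instance, after a ReLU layer), which is the situation in which run-length encoding tends to pay off.

```python
from typing import List, Tuple


def linear_quantize(values: List[float], num_bits: int = 8) -> Tuple[List[int], float, float]:
    """Uniformly map floats onto integer levels 0 .. 2**num_bits - 1."""
    lo, hi = min(values), max(values)
    levels = (1 << num_bits) - 1
    # Guard against a constant input, where hi == lo.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, lo, scale


def dequantize(codes: List[int], lo: float, scale: float) -> List[float]:
    """Approximate inverse of linear_quantize (lossy)."""
    return [lo + c * scale for c in codes]


def run_length_encode(codes: List[int]) -> List[Tuple[int, int]]:
    """Collapse consecutive repeats into (value, count) pairs (lossless)."""
    runs: List[Tuple[int, int]] = []
    for c in codes:
        if runs and runs[-1][0] == c:
            runs[-1] = (c, runs[-1][1] + 1)
        else:
            runs.append((c, 1))
    return runs


def run_length_decode(runs: List[Tuple[int, int]]) -> List[int]:
    """Expand (value, count) pairs back into the original code sequence."""
    out: List[int] = []
    for value, count in runs:
        out.extend([value] * count)
    return out


# Hypothetical flattened activations with many zeros (assumed post-ReLU).
activations = [0.0, 0.0, 0.0, 0.0, 2.1, 0.0, 0.0, 3.0, 0.0, 0.0, 0.0, 0.9]

# End device: quantize (lossy), then run-length encode (lossless).
codes, lo, scale = linear_quantize(activations, num_bits=2)
runs = run_length_encode(codes)

# Server: decode and dequantize before running the rest of the model.
restored = dequantize(run_length_decode(runs), lo, scale)

print(codes)  # [0, 0, 0, 0, 2, 0, 0, 3, 0, 0, 0, 1]
print(runs)   # [(0, 4), (2, 1), (0, 2), (3, 1), (0, 3), (1, 1)]
```

In this toy case twelve floating-point values shrink to six (value, count) pairs. The quantization step is the only lossy part, and its error per value is at most half a quantization step.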
Pages: 7357-7362
Page count: 6