DNN Split Computing: Quantization and Run-length Coding are Enough

Cited by: 0
Authors
Carra, Damiano [1]
Neglia, Giovanni [2]
Affiliations
[1] Univ Verona, Verona, Italy
[2] Univ Cote Azur, Inria, Nice, France
Source
IEEE CONFERENCE ON GLOBAL COMMUNICATIONS, GLOBECOM | 2023
DOI
10.1109/GLOBECOM54140.2023.10437445
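Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809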
Abstract
Split computing, a recently developed paradigm, capitalizes on the computational resources of end devices to enhance inference efficiency in machine learning (ML) applications. In this approach, the end device processes the input data and transmits intermediate results to a cloud server, which then completes the inference computation. While the main goals of split computing are to reduce latency, minimize energy consumption, and decrease data transfer overhead, minimizing data transmission time remains a challenge. Many existing strategies involve modifying the ML model architecture, which ultimately requires resource-intensive retraining. In our work, we explore lossless and lossy techniques to encode intermediate results without modifying the ML model. Concentrating on image classification and object detection, two prevalent ML applications, we assess the advantages and limitations of each technique. Our findings indicate that simple tools, such as linear quantization and run-length encoding, already accomplish considerable information reduction, which is on par with more complex state-of-the-art techniques that necessitate model retraining. These tools are computationally efficient and do not burden the end device.
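As a rough illustration of the two tools named in the abstract, the following minimal Python sketch applies linear (uniform) quantization followed by run-length encoding to a flat list of values. It is not the authors' implementation: the function names, the 2-bit setting, and the sample data are hypothetical, and the sketch assumes the intermediate results contain long runs of equal values (for example, zeros), which the abstract does not state.

```python
def linear_quantize(values, bits=4):
    """Uniformly map floats to integer codes in [0, 2**bits - 1] (lossy)."""
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    # Guard against a constant input, where hi == lo.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Approximate inverse of linear_quantize."""
    return [lo + c * scale for c in codes]


def run_length_encode(codes):
    """Collapse consecutive equal codes into (value, count) pairs (lossless)."""
    runs = []
    for c in codes:
        if runs and runs[-1][0] == c:
            runs[-1][1] += 1
        else:
            runs.append([c, 1])
    return [(v, n) for v, n in runs]


def run_length_decode(runs):
    """Expand (value, count) pairs back into the original code sequence."""
    out = []
    for v, n in runs:
        out.extend([v] * n)
    return out


# Hypothetical intermediate values with many zeros (an assumption of this sketch).
activations = [0.0, 0.0, 0.0, 1.2, 0.0, 0.0, 3.0, 3.0, 0.0, 0.0, 0.0, 0.0]
codes, lo, scale = linear_quantize(activations, bits=2)
runs = run_length_encode(codes)
restored = dequantize(run_length_decode(runs), lo, scale)
```

In this sketch the end device would send `runs` together with `lo` and `scale`, and the server would rebuild `restored` before running the remaining layers. Quantization is the lossy step, with each value reconstructed to within half a quantization step, while the run-length stage is lossless.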
Pages: 7357-7362
Page count: 6