PMP: A partition-match parallel mechanism for DNN inference acceleration in cloud-edge collaborative environments

Cited by: 2
Authors
Liao, Zhuofan [1]
Zhang, Xiangyu [1]
He, Shiming [1]
Tang, Qiang [1]
Affiliations
[1] Changsha Univ Sci & Technol, Sch Comp & Commun Engn, Changsha 410114, Hunan, Peoples R China
Funding
National Natural Science Foundation of China; Academy of Finland
Keywords
Edge computing (EC); Deep neural networks (DNNs); Parallel computing; Offloading; Cloud-edge collaboration; Distributed inference
DOI
10.1016/j.jnca.2023.103720
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
To meet the requirements of delay-sensitive deep learning tasks, Deep Neural Network (DNN) models are often partitioned and deployed across the cloud-edge environment for parallel, collaborative inference. However, existing parallel coordination mechanisms are poorly suited to this environment: the strong inter-layer dependence of DNNs increases transmission latency and inference wait times, which undermines the low-latency advantage of edge computing. To resolve this contradiction, the PMP mechanism accounts for the inter-layer transfer dependence of partitioning solutions and employs a multi-objective equalization algorithm to derive DNN model partitioning strategies suitable for multi-way parallel computing. The mechanism also builds a DNN inference time prediction model based on these partitions and uses an iterative matching algorithm to approximate an optimal DNN inference workflow. The proposed mechanism is evaluated extensively on various DNN models, and the results show that it outperforms existing schemes, namely local execution, CoEdge, and EdgeFlow. PMP reduces total inference latency by 80.9%, 37.9%, and 9.1% relative to these schemes, respectively.
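The trade-off the abstract describes, between compute time and the cost of transferring intermediate layer outputs, can be illustrated with a much simpler problem than the one PMP solves. The sketch below is not the PMP algorithm: it assumes a purely sequential DNN, a single edge device, a single cloud server, and one split point, and it exhaustively picks the split that minimizes total latency. The function name `best_split`, all per-layer timings, payload sizes, and the bandwidth figure are hypothetical values chosen for illustration.

```python
from typing import List, Tuple


def best_split(edge_ms: List[float], cloud_ms: List[float],
               out_kb: List[float], input_kb: float,
               bandwidth_kb_per_ms: float) -> Tuple[int, float]:
    """Return (k, latency_ms): layers [0, k) run on the edge, [k, n) on the cloud.

    k == 0 means everything runs in the cloud (the raw input is transmitted);
    k == n means everything runs locally (nothing is transmitted).
    """
    n = len(edge_ms)
    best_k, best_latency = 0, float("inf")
    for k in range(n + 1):
        edge_time = sum(edge_ms[:k])      # layers executed on the edge
        cloud_time = sum(cloud_ms[k:])    # layers executed in the cloud
        if k == n:
            transfer = 0.0                # fully local: no upload
        else:
            # Upload the raw input (k == 0) or the output of layer k - 1.
            payload = input_kb if k == 0 else out_kb[k - 1]
            transfer = payload / bandwidth_kb_per_ms
        latency = edge_time + transfer + cloud_time
        if latency < best_latency:
            best_k, best_latency = k, latency
    return best_k, best_latency


# Hypothetical 4-layer model: per-layer times in ms, output sizes in KB.
edge_ms = [4.0, 6.0, 10.0, 20.0]
cloud_ms = [1.0, 1.0, 2.0, 4.0]
out_kb = [400.0, 100.0, 30.0, 1.0]

k, latency = best_split(edge_ms, cloud_ms, out_kb,
                        input_kb=600.0, bandwidth_kb_per_ms=10.0)
print(k, latency)
```

With these made-up numbers the cheapest split keeps the first two layers on the edge, because the second layer's output is small enough that uploading it costs less than computing the remaining layers locally. A multi-way parallel setting such as the one the abstract targets adds several devices and dependencies between partitions, which this single-split sketch does not model.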
Pages: 12
Related papers
31 in total
  • [21] Multi-Compression Scale DNN Inference Acceleration based on Cloud-Edge-End Collaboration
    Qi, Huamei
    Ren, Fang
    Wang, Leilei
    Jiang, Ping
    Wan, Shaohua
    Deng, Xiaoheng
    ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS, 2024, 23 (01)
  • [22] A new mechanism for reef coral monitoring based on underwater cloud-edge collaborative architecture
    Jin Z.
    Duan C.
    Yang Q.
    Su Y.
    Xi Tong Gong Cheng Yu Dian Zi Ji Shu/Systems Engineering and Electronics, 2022, 44 (12) : 3829 - 3836
  • [23] New mechanism for underwater monitoring in a software-defined cloud-edge collaborative architecture
    Jin, Zhigang
    Hong, Ye
    Su, Yishan
    Yang, Qiuling
    Xi Tong Gong Cheng Yu Dian Zi Ji Shu/Systems Engineering and Electronics, 2024, 46 (03) : 1101 - 1108
  • [24] Joint DNN Partition and Resource Allocation for Task Offloading in Edge-Cloud-Assisted IoT Environments
    Fan, Wenhao
    Gao, Li
    Su, Yi
    Wu, Fan
    Liu, Yuan'an
    IEEE INTERNET OF THINGS JOURNAL, 2023, 10 (12) : 10146 - 10159
  • [25] LFDC: Low-Energy Federated Deep Reinforcement Learning for Caching Mechanism in Cloud-Edge Collaborative
    Zhang, Xinyu
    Hu, Zhigang
    Zheng, Meiguang
    Liang, Yang
    Xiao, Hui
    Zheng, Hao
    Xu, Aikun
    APPLIED SCIENCES-BASEL, 2023, 13 (10)
  • [26] CNN PC: End-Edge-Cloud Collaborative CNN Inference With Joint Model Partition and Compression
    Yang, Shusen
    Zhang, Zhanhua
    Zhao, Cong
    Song, Xin
    Guo, Siyan
    Li, Hailiang
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, 33 (12) : 4039 - 4056
  • [27] Collaborative DNNs Inference with Joint Model Partition and Compression in Mobile Edge-Cloud Computing Networks
    Tang, Yaxin
    Li, Xiuhua
    Li, Hui
    Yang, Zhengyi
    Wang, Xiaofei
    Leung, Victor C. M.
    2024 IEEE WIRELESS COMMUNICATIONS AND NETWORKING CONFERENCE, WCNC 2024, 2024
  • [28] Task Partition-Based Caching Optimization for Delay-Sensitive Content Distribution in Cloud-Edge Cooperation Environments
    Qin, Xiaolin
    2023 IEEE 98TH VEHICULAR TECHNOLOGY CONFERENCE, VTC2023-FALL, 2023
  • [29] Real-Time Offloading for Dependent and Parallel Tasks in Cloud-Edge Environments Using Deep Reinforcement Learning
    Chen, Xing
    Hu, Shengxi
    Yu, Chujia
    Chen, Zheyi
    Min, Geyong
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2024, 35 (03) : 391 - 404
  • [30] A Splittable DNN-Based Object Detector for Edge-Cloud Collaborative Real-Time Video Inference
    Lee, Joo Chan
    Kim, Yongwoo
    Moon, SungTae
    Ko, Jong Hwan
    2021 17TH IEEE INTERNATIONAL CONFERENCE ON ADVANCED VIDEO AND SIGNAL BASED SURVEILLANCE (AVSS 2021), 2021