PMP: A partition-match parallel mechanism for DNN inference acceleration in cloud-edge collaborative environments

Cited by: 2
Authors
Liao, Zhuofan [1 ]
Zhang, Xiangyu [1 ]
He, Shiming [1 ]
Tang, Qiang [1 ]
Affiliations
[1] Changsha Univ Sci & Technol, Sch Comp & Commun Engn, Changsha 410114, Hunan, Peoples R China
Funding
National Natural Science Foundation of China; Academy of Finland
Keywords
Edge computing (EC); Deep neural networks (DNNs); Parallel computing; Offloading; Cloud-edge collaboration; Distributed inference
DOI
10.1016/j.jnca.2023.103720
Chinese Library Classification (CLC)
TP3 [Computing Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
To address the challenges of delay-sensitive deep learning tasks, Deep Neural Network (DNN) models are often partitioned and deployed to the cloud-edge environment for parallel and collaborative inference. However, existing parallel coordination mechanisms are not well suited to this environment: the strong inter-layer dependence of DNNs increases transmission latency and inference wait times, which undermines the low-latency advantage of edge computing. To resolve this contradiction, this paper proposes PMP, a partition-match parallel mechanism that accounts for the inter-layer transfer dependence of partitioning solutions and employs a multi-objective equalization algorithm to derive DNN model partitioning strategies suitable for multi-way parallel computing. Based on these partitions, the mechanism builds a DNN inference time prediction model and uses an iterative matching algorithm to approximate an optimal DNN inference workflow. Extensive evaluations on various DNN models demonstrate the superiority of PMP over existing schemes: it reduces total inference latency by 80.9%, 37.9%, and 9.1% compared to local execution, CoEdge, and EdgeFlow, respectively.
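Although only the abstract is available here, the workflow it describes (partition the model layer-wise, predict per-partition inference time, then search for the best assignment) can be outlined with a small example. The sketch below is a heavily simplified, hypothetical two-tier version with a single edge node offloading to the cloud; the cost model, layer statistics, and the `predict_latency` helper are illustrative assumptions, not the paper's multi-objective equalization or iterative matching algorithms.

```python
# Minimal illustrative sketch of layer-wise DNN partitioning driven by a
# latency-prediction model, in the spirit of the abstract above. This is
# NOT the paper's PMP algorithm: the two-tier (one edge device + cloud)
# setting and all numbers and names below are assumptions for illustration.

LAYER_GFLOPS = [0.8, 1.2, 2.0, 2.0, 1.0, 0.4]   # hypothetical per-layer compute cost
LAYER_OUT_MB = [3.0, 1.5, 0.8, 0.8, 0.2, 0.05]  # hypothetical per-layer activation size
INPUT_MB = 3.0                                   # hypothetical input size

def predict_latency(cut, edge_gflops=5.0, cloud_gflops=50.0, uplink_mbps=80.0):
    """Predicted end-to-end latency (s) when layers [0, cut) run on the edge
    and layers [cut, N) run in the cloud, with one activation upload at the cut."""
    edge_time = sum(LAYER_GFLOPS[:cut]) / edge_gflops
    cloud_time = sum(LAYER_GFLOPS[cut:]) / cloud_gflops
    if cut == len(LAYER_GFLOPS):
        transfer_time = 0.0  # whole model stays on the edge, nothing is uploaded
    else:
        transfer_mb = LAYER_OUT_MB[cut - 1] if cut > 0 else INPUT_MB
        transfer_time = transfer_mb * 8.0 / uplink_mbps  # MB -> Mbit, divide by link rate
    return edge_time + transfer_time + cloud_time

if __name__ == "__main__":
    cuts = range(len(LAYER_GFLOPS) + 1)
    best = min(cuts, key=predict_latency)
    print(f"best cut after layer {best}: {predict_latency(best):.3f} s predicted")
```

In the paper's actual setting, the search would span several edge devices in parallel and balance multiple objectives at once, but the same pattern of scoring candidate partitions with a latency predictor applies.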
Pages: 12