PMP: A partition-match parallel mechanism for DNN inference acceleration in cloud-edge collaborative environments

Cited by: 2
Authors
Liao, Zhuofan [1 ]
Zhang, Xiangyu [1 ]
He, Shiming [1 ]
Tang, Qiang [1 ]
Affiliations
[1] Changsha Univ Sci & Technol, Sch Comp & Commun Engn, Changsha 410114, Hunan, Peoples R China
Funding
National Natural Science Foundation of China; Academy of Finland;
Keywords
Edge computing (EC); Deep neural networks (DNNs); Parallel computing; Offloading; Cloud-edge collaboration; DISTRIBUTED INFERENCE;
DOI
10.1016/j.jnca.2023.103720
Chinese Library Classification (CLC)
TP3 [Computing technology; computer technology];
Discipline code
0812;
Abstract
To address delay-sensitive deep learning tasks, Deep Neural Network (DNN) models are often partitioned and deployed across the cloud-edge environment for parallel, collaborative inference. However, existing parallel coordination mechanisms are ill-suited to this environment: the strong inter-layer dependence of DNNs inflates transmission latency and inference wait times, undermining the low-latency advantage of edge computing. To resolve this contradiction, this paper proposes PMP, a partition-match parallel mechanism that takes into account the inter-layer transfer dependence of partitioning solutions and employs a multi-objective equalization algorithm to derive DNN model partitioning strategies suited to multi-way parallel computing. The mechanism further establishes a DNN inference time prediction model based on these partitions and uses an iterative matching algorithm to approximate an optimal DNN inference workflow. Extensive evaluations on various DNN models demonstrate PMP's superiority over existing schemes, reducing total inference latency by 80.9%, 37.9%, and 9.1% compared with local execution, CoEdge, and EdgeFlow, respectively.
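The partition-and-predict idea sketched in the abstract can be illustrated with a toy example. This is not the paper's multi-objective equalization or iterative matching algorithm; all function names and cost numbers below are hypothetical. It simply splits a chain of DNN layers into contiguous, compute-balanced partitions across devices and estimates a sequential (non-pipelined) latency from per-layer compute costs plus inter-partition transfer costs.

```python
# Toy sketch: balance a layer chain across devices, then predict latency.

def balanced_partition(layer_costs, n_devices):
    """Greedy split of a layer chain into n contiguous, cost-balanced parts."""
    total = sum(layer_costs)
    target = total / n_devices
    parts, current, acc = [], [], 0.0
    for i, c in enumerate(layer_costs):
        current.append(i)
        acc += c
        # Close the partition once it reaches the per-device target,
        # keeping at least one layer for each remaining device.
        remaining_devices = n_devices - len(parts) - 1
        remaining_layers = len(layer_costs) - i - 1
        if acc >= target and remaining_layers >= remaining_devices > 0:
            parts.append(current)
            current, acc = [], 0.0
    parts.append(current)
    return parts

def predict_latency(parts, layer_costs, transfer_cost):
    """Sequential latency: total compute plus one transfer per partition boundary."""
    compute = sum(layer_costs[i] for p in parts for i in p)
    return compute + transfer_cost * (len(parts) - 1)

costs = [4.0, 2.0, 6.0, 3.0, 5.0, 1.0]       # per-layer compute cost (ms)
parts = balanced_partition(costs, 3)
print(parts)                                  # → [[0, 1, 2], [3, 4], [5]]
print(predict_latency(parts, costs, 0.5))     # compute 21.0 ms + 2 transfers
```

A real partitioner would additionally weigh the inter-layer transfer volumes when choosing cut points, which is the dependence issue the abstract highlights.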
Pages: 12
Related papers
31 items in total
  • [1] EosDNN: An Efficient Offloading Scheme for DNN Inference Acceleration in Local-Edge-Cloud Collaborative Environments
    Xue, Min
    Wu, Huaming
    Li, Ruidong
    Xu, Minxian
    Jiao, Pengfei
    IEEE TRANSACTIONS ON GREEN COMMUNICATIONS AND NETWORKING, 2022, 6 (01): 248-264
  • [2] Cloud-Edge Collaborative Inference with Network Pruning
    Li, Mingran
    Zhang, Xuejun
    Guo, Jiasheng
    Li, Feng
    ELECTRONICS, 2023, 12 (17)
  • [3] An adaptive DNN inference acceleration framework with end-edge-cloud collaborative computing
    Liu, Guozhi
    Dai, Fei
    Xu, Xiaolong
    Fu, Xiaodong
    Dou, Wanchun
    Kumar, Neeraj
    Bilal, Muhammad
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2023, 140: 422-435
  • [4] Ace-Sniper: Cloud-Edge Collaborative Scheduling Framework With DNN Inference Latency Modeling on Heterogeneous Devices
    Liu, Weihong
    Geng, Jiawei
    Zhu, Zongwei
    Zhao, Yang
    Ji, Cheng
    Li, Changlong
    Lian, Zirui
    Zhou, Xuehai
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2024, 43 (02): 534-547
  • [5] MEDIA: An Incremental DNN Based Computation Offloading for Collaborative Cloud-Edge Computing
    Zhao, Liang
    Han, Yingcan
    Hawbani, Ammar
    Wan, Shaohua
    Guo, Zhenzhou
    Guizani, Mohsen
    IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING, 2024, 11 (02): 1986-1998
  • [6] Collaborative Cloud-Edge Service Cognition Framework for DNN Configuration Toward Smart IIoT
    Xiao, Wenjing
    Miao, Yiming
    Fortino, Giancarlo
    Wu, Di
    Chen, Min
    Hwang, Kai
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2022, 18 (10): 7038-7047
  • [7] DNN Real-Time Collaborative Inference Acceleration with Mobile Edge Computing
    Yang, Run
    Li, Yan
    He, Hui
    Zhang, Weizhe
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022
  • [9] A bidirectional DNN partition mechanism for efficient pipeline parallel training in cloud
    Cui, Lingyun
    Qu, Zhihao
    Zhang, Guomin
    Tang, Bin
    Ye, Baoliu
    JOURNAL OF CLOUD COMPUTING-ADVANCES SYSTEMS AND APPLICATIONS, 2023, 12 (01)
  • [10] Sniper: Cloud-Edge Collaborative Inference Scheduling with Neural Network Similarity Modeling
    Liu, Weihong
    Geng, Jiawei
    Zhu, Zongwei
    Cao, Jing
    Lian, Zirui
    PROCEEDINGS OF THE 59TH ACM/IEEE DESIGN AUTOMATION CONFERENCE, DAC 2022, 2022: 505-510