Ace-Sniper: Cloud-Edge Collaborative Scheduling Framework With DNN Inference Latency Modeling on Heterogeneous Devices

Cited by: 0
Authors
Liu, Weihong [1 ,2 ]
Geng, Jiawei [1 ,2 ]
Zhu, Zongwei [2 ,3 ]
Zhao, Yang [4 ]
Ji, Cheng [5 ]
Li, Changlong [6 ]
Lian, Zirui [1 ,2 ]
Zhou, Xuehai [1 ,2 ]
Affiliations
[1] Univ Sci & Technol China, Sch Comp Sci & Technol, Hefei 230026, Peoples R China
[2] Univ Sci & Technol China, Suzhou Inst Adv Res, Suzhou 215123, Peoples R China
[3] Univ Sci & Technol China, Sch Software Engn, Hefei 230026, Peoples R China
[4] Xian Inst Space Radio Technol, Natl Key Lab Sci & Technol Space Microwave, Xian 710199, Peoples R China
[5] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Peoples R China
[6] East China Normal Univ, Sch Comp Sci & Technol, Shanghai 200063, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Cloud-edge collaborative; hardware resource modeling (HRM); heterogeneous platform; inference latency modeling;
DOI
10.1109/TCAD.2023.3314388
CLC Number
TP3 [Computing Technology, Computer Technology];
Discipline Code
0812;
Abstract
Cloud-edge collaborative inference requires efficient scheduling of artificial intelligence (AI) tasks to the appropriate edge intelligence devices. Deep neural network (DNN) inference latency has become a vital basis for improving scheduling efficiency. However, edge devices are highly heterogeneous due to differences in hardware architectures, computing power, etc. Meanwhile, diverse DNNs continue to iterate over time. This diversity of devices and DNNs introduces high computational costs for measurement-based methods, while invasive prediction methods face significant development effort and application limitations. In this article, we propose and develop Ace-Sniper, a scheduling framework with DNN inference latency modeling on heterogeneous devices. First, to address device heterogeneity, a unified hardware resource modeling (HRM) is designed that treats the platforms as black-box functions outputting feature vectors. Second, neural network similarity (NNS) is introduced for feature extraction from diverse and frequently iterated DNNs. Finally, with the results of HRM and NNS as input, a performance characterization network is designed to predict the latencies of given unseen DNNs on heterogeneous devices; these predictions can be plugged into most time-based scheduling algorithms. Experimental results show that the average relative error of DNN inference latency prediction is 11.11%, and the prediction accuracy reaches 93.2%. Compared with nontime-aware scheduling methods, the average task waiting time is reduced by 82.95%, and platform throughput is improved by 63% on average.
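The pipeline the abstract describes can be illustrated with a minimal, hypothetical sketch. All names, feature dimensions, and numbers below are assumptions for illustration, not taken from the paper: a fixed linear model stands in for the learned performance characterization network, and a greedy earliest-finish rule stands in for "most time-based scheduling algorithms". The interface is the point: a device's HRM feature vector and a DNN's NNS feature vector are concatenated into a latency prediction, which then drives placement.

```python
def predict_latency(hrm, nns, weights, bias=0.0):
    """Latency estimate (ms) from concatenated device (HRM) + DNN (NNS)
    features. A fixed linear model stands in for the paper's learned
    performance characterization network."""
    x = hrm + nns
    return max(0.0, sum(w * xi for w, xi in zip(weights, x)) + bias)

def schedule(tasks, devices, weights):
    """Greedily assign each task (NNS vector) to the device whose
    predicted finish time (current busy-until + predicted latency)
    is smallest -- a simple stand-in for a time-based scheduler."""
    busy = {name: 0.0 for name in devices}
    plan = []
    for task_id, nns in tasks:
        best = min(devices,
                   key=lambda d: busy[d] + predict_latency(devices[d], nns, weights))
        busy[best] += predict_latency(devices[best], nns, weights)
        plan.append((task_id, best))
    return plan, busy

# Toy setup: 2-dim HRM vectors per device, 1-dim NNS vector per task.
devices = {"jetson": [0.4, 0.5], "server_gpu": [1.0, 0.9]}
tasks = [("resnet", [0.8]), ("mobilenet", [0.2])]
w = [10.0, 5.0, 30.0]  # weights over the 2 HRM dims + 1 NNS dim
plan, busy = schedule(tasks, devices, w)
```

With these toy numbers, the heavy task lands on the first device and the second task is diverted to the other device because its predicted finish time there is earlier, which is exactly the waiting-time reduction the abstract reports for time-aware scheduling.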
Pages: 534-547
Page count: 14
Related Papers
10 records
  • [1] Sniper: Cloud-Edge Collaborative Inference Scheduling with Neural Network Similarity Modeling
    Liu, Weihong
    Geng, Jiawei
    Zhu, Zongwei
    Cao, Jing
    Lian, Zirui
    PROCEEDINGS OF THE 59TH ACM/IEEE DESIGN AUTOMATION CONFERENCE, DAC 2022, 2022, : 505 - 510
  • [2] Collaborative Cloud-Edge Service Cognition Framework for DNN Configuration Toward Smart IIoT
    Xiao, Wenjing
    Miao, Yiming
    Fortino, Giancarlo
    Wu, Di
    Chen, Min
    Hwang, Kai
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2022, 18 (10) : 7038 - 7047
  • [3] A Cloud-Edge Collaborative Framework for Adaptive Quality Prediction Modeling in IIoT
    Yuan, Xiaofeng
    Wang, Yichen
    Wang, Kai
    Ye, Lingjian
    Shen, Feifan
    Wang, Yalin
    Yang, Chunhua
    Gui, Weihua
    IEEE SENSORS JOURNAL, 2024, 24 (20) : 33656 - 33668
  • [4] PMP: A partition-match parallel mechanism for DNN inference acceleration in cloud-edge collaborative environments
    Liao, Zhuofan
    Zhang, Xiangyu
    He, Shiming
    Tang, Qiang
    JOURNAL OF NETWORK AND COMPUTER APPLICATIONS, 2023, 218
  • [5] SecoInfer: Secure DNN End-Edge Collaborative Inference Framework Optimizing Privacy and Latency
    Yao, Yunhao
    Hou, Jiahui
    Wu, Guangyu
    Cheng, Yihang
    Yuan, Mu
    Luo, Puhan
    Wang, Zhiqiang
    Li, Xiang-Yang
    ACM TRANSACTIONS ON SENSOR NETWORKS, 2024, 20 (06)
  • [6] An adaptive DNN inference acceleration framework with end-edge-cloud collaborative computing
    Liu, Guozhi
    Dai, Fei
    Xu, Xiaolong
    Fu, Xiaodong
    Dou, Wanchun
    Kumar, Neeraj
    Bilal, Muhammad
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2023, 140 : 422 - 435
  • [7] A Co-Scheduling Framework for DNN Models on Mobile and Edge Devices With Heterogeneous Hardware
    Xu, Zhiyuan
    Yang, Dejun
    Yin, Chengxiang
    Tang, Jian
    Wang, Yanzhi
    Xue, Guoliang
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2023, 22 (03) : 1275 - 1288
  • [8] NeiLatS: Neighbor-Aware Latency-Sensitive Application Scheduling in Heterogeneous Cloud-Edge Environment
    Li, Huadong
    Liu, Hui
    Liu, Changyuan
    Chen, Aoqi
    Niu, Zhaocheng
    Du, Junzhao
    PROCEEDINGS OF THE 52ND INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, ICPP 2023, 2023, : 615 - 624
  • [9] EKDF: An Ensemble Knowledge Distillation Framework for Robust Collaborative Inference on Heterogeneous Edge Devices
    Wu, Shangrui
    Li, Yupeng
    Xu, Yang
    Liu, Qin
    Jia, Weijia
    Wang, Tian
    2023 19TH INTERNATIONAL CONFERENCE ON MOBILITY, SENSING AND NETWORKING, MSN 2023, 2023, : 191 - 198
  • [10] A Heterogeneous Cloud-Edge Collaborative Computing Architecture with Affinity-Based Workflow Scheduling and Resource Allocation for Internet-of-Things Applications
    Lyu, Shuyu
    Dai, Xinfa
    Ma, Zhong
    Zhou, Ying
    Liu, Xing
    Gao, Yi
    Hu, Zhekun
    MOBILE NETWORKS & APPLICATIONS, 2023, 28 (04) : 1443 - 1459