Ace-Sniper: Cloud-Edge Collaborative Scheduling Framework With DNN Inference Latency Modeling on Heterogeneous Devices

Cited by: 0
Authors
Liu, Weihong [1 ,2 ]
Geng, Jiawei [1 ,2 ]
Zhu, Zongwei [2 ,3 ]
Zhao, Yang [4 ]
Ji, Cheng [5 ]
Li, Changlong [6 ]
Lian, Zirui [1 ,2 ]
Zhou, Xuehai [1 ,2 ]
Affiliations
[1] Univ Sci & Technol China, Sch Comp Sci & Technol, Hefei 230026, Peoples R China
[2] Univ Sci & Technol China, Suzhou Inst Adv Res, Suzhou 215123, Peoples R China
[3] Univ Sci & Technol China, Sch Software Engn, Hefei 230026, Peoples R China
[4] Xian Inst Space Radio Technol, Natl Key Lab Sci & Technol Space Microwave, Xian 710199, Peoples R China
[5] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Peoples R China
[6] East China Normal Univ, Sch Comp Sci & Technol, Shanghai 200063, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Cloud-edge collaborative; hardware resource modeling (HRM); heterogeneous platform; inference latency modeling;
DOI
10.1109/TCAD.2023.3314388
CLC Number
TP3 [Computing Technology, Computer Technology];
Discipline Code
0812;
Abstract
Cloud-edge collaborative inference requires efficient scheduling of artificial intelligence (AI) tasks to the appropriate edge intelligence devices. Deep neural network (DNN) inference latency has become a vital basis for improving scheduling efficiency. However, edge devices are highly heterogeneous due to differences in hardware architectures, computing power, etc. Meanwhile, diverse DNNs continue to iterate over time. This diversity of devices and DNNs introduces high computational costs for measurement-based methods, while invasive prediction methods face significant development effort and application limitations. In this article, we propose and develop Ace-Sniper, a scheduling framework with DNN inference latency modeling on heterogeneous devices. First, to address device heterogeneity, a unified hardware resource modeling (HRM) is designed that treats the platforms as black-box functions outputting feature vectors. Second, neural network similarity (NNS) is introduced for feature extraction from diverse and frequently iterated DNNs. Finally, with the results of HRM and NNS as input, a performance characterization network is designed to predict the latencies of given unseen DNNs on heterogeneous devices; these predictions can be plugged into most time-based scheduling algorithms. Experimental results show that the average relative error of DNN inference latency prediction is 11.11%, and the prediction accuracy reaches 93.2%. Compared with nontime-aware scheduling methods, the average task waiting time is reduced by 82.95%, and platform throughput is improved by 63% on average.
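The pipeline the abstract describes can be illustrated with a minimal, hypothetical sketch. All names, feature dimensions, and numbers below are assumptions for illustration, not taken from the paper: a fixed linear model stands in for the learned performance characterization network, and a greedy earliest-finish rule stands in for "most time-based scheduling algorithms". The interface is the point: a device's HRM feature vector and a DNN's NNS feature vector are concatenated into a latency prediction, which then drives placement.

```python
def predict_latency(hrm, nns, weights, bias=0.0):
    """Latency estimate (ms) from concatenated device (HRM) + DNN (NNS)
    features. A fixed linear model stands in for the paper's learned
    performance characterization network."""
    x = hrm + nns
    return max(0.0, sum(w * xi for w, xi in zip(weights, x)) + bias)

def schedule(tasks, devices, weights):
    """Greedily assign each task (NNS vector) to the device whose
    predicted finish time (current busy-until + predicted latency)
    is smallest -- a simple stand-in for a time-based scheduler."""
    busy = {name: 0.0 for name in devices}
    plan = []
    for task_id, nns in tasks:
        best = min(devices,
                   key=lambda d: busy[d] + predict_latency(devices[d], nns, weights))
        busy[best] += predict_latency(devices[best], nns, weights)
        plan.append((task_id, best))
    return plan, busy

# Toy setup: 2-dim HRM vectors per device, 1-dim NNS vector per task.
devices = {"jetson": [0.4, 0.5], "server_gpu": [1.0, 0.9]}
tasks = [("resnet", [0.8]), ("mobilenet", [0.2])]
w = [10.0, 5.0, 30.0]  # weights over the 2 HRM dims + 1 NNS dim
plan, busy = schedule(tasks, devices, w)
```

With these toy numbers, the heavy task lands on the first device and the second task is diverted to the other device because its predicted finish time there is earlier, which is exactly the waiting-time reduction the abstract reports for time-aware scheduling.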
Pages: 534-547
Page count: 14
Related Papers
10 records
  • [1] Sniper: Cloud-Edge Collaborative Inference Scheduling with Neural Network Similarity Modeling
    Liu, Weihong
    Geng, Jiawei
    Zhu, Zongwei
    Cao, Jing
    Lian, Zirui
    PROCEEDINGS OF THE 59TH ACM/IEEE DESIGN AUTOMATION CONFERENCE, DAC 2022, 2022, : 505 - 510
  • [2] Collaborative Cloud-Edge Service Cognition Framework for DNN Configuration Toward Smart IIoT
    Xiao, Wenjing
    Miao, Yiming
    Fortino, Giancarlo
    Wu, Di
    Chen, Min
    Hwang, Kai
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2022, 18 (10) : 7038 - 7047
  • [3] A Cloud-Edge Collaborative Framework for Adaptive Quality Prediction Modeling in IIoT
    Yuan, Xiaofeng
    Wang, Yichen
    Wang, Kai
    Ye, Lingjian
    Shen, Feifan
    Wang, Yalin
    Yang, Chunhua
    Gui, Weihua
    IEEE SENSORS JOURNAL, 2024, 24 (20) : 33656 - 33668
  • [4] PMP: A partition-match parallel mechanism for DNN inference acceleration in cloud-edge collaborative environments
    Liao, Zhuofan
    Zhang, Xiangyu
    He, Shiming
    Tang, Qiang
    JOURNAL OF NETWORK AND COMPUTER APPLICATIONS, 2023, 218
  • [5] SecoInfer: Secure DNN End-Edge Collaborative Inference Framework Optimizing Privacy and Latency
    Yao, Yunhao
    Hou, Jiahui
    Wu, Guangyu
    Cheng, Yihang
    Yuan, Mu
    Luo, Puhan
    Wang, Zhiqiang
    Li, Xiang-Yang
    ACM TRANSACTIONS ON SENSOR NETWORKS, 2024, 20 (06)
  • [6] An adaptive DNN inference acceleration framework with end-edge-cloud collaborative computing
    Liu, Guozhi
    Dai, Fei
    Xu, Xiaolong
    Fu, Xiaodong
    Dou, Wanchun
    Kumar, Neeraj
    Bilal, Muhammad
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2023, 140 : 422 - 435
  • [7] A Co-Scheduling Framework for DNN Models on Mobile and Edge Devices With Heterogeneous Hardware
    Xu, Zhiyuan
    Yang, Dejun
    Yin, Chengxiang
    Tang, Jian
    Wang, Yanzhi
    Xue, Guoliang
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2023, 22 (03) : 1275 - 1288
  • [8] NeiLatS: Neighbor-Aware Latency-Sensitive Application Scheduling in Heterogeneous Cloud-Edge Environment
    Li, Huadong
    Liu, Hui
    Liu, Changyuan
    Chen, Aoqi
    Niu, Zhaocheng
    Du, Junzhao
    PROCEEDINGS OF THE 52ND INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, ICPP 2023, 2023, : 615 - 624
  • [9] EKDF: An Ensemble Knowledge Distillation Framework for Robust Collaborative Inference on Heterogeneous Edge Devices
    Wu, Shangrui
    Li, Yupeng
    Xu, Yang
    Liu, Qin
    Jia, Weijia
    Wang, Tian
    2023 19TH INTERNATIONAL CONFERENCE ON MOBILITY, SENSING AND NETWORKING, MSN 2023, 2023, : 191 - 198
  • [10] A Heterogeneous Cloud-Edge Collaborative Computing Architecture with Affinity-Based Workflow Scheduling and Resource Allocation for Internet-of-Things Applications
    Lyu, Shuyu
    Dai, Xinfa
    Ma, Zhong
    Zhou, Ying
    Liu, Xing
    Gao, Yi
    Hu, Zhekun
    MOBILE NETWORKS & APPLICATIONS, 2023, 28 (04) : 1443 - 1459