Auto-Split: A General Framework of Collaborative Edge-Cloud AI

Cited by: 47
Authors
Banitalebi-Dehkordi, Amin [1]
Vedula, Naveen [1]
Pei, Jian [2]
Xia, Fei [3]
Wang, Lanjun [1]
Zhang, Yong [1]
Affiliations
[1] Huawei Technol Canada Co Ltd, Vancouver, BC, Canada
[2] Simon Fraser Univ, Sch Comp Sci, Vancouver, BC, Canada
[3] Huawei Technol, Shenzhen, Peoples R China
Source
KDD '21: PROCEEDINGS OF THE 27TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING | 2021
Keywords
Edge-Cloud Collaboration; Network Splitting; Neural Networks; Mixed Precision; Collaborative Intelligence; Distributed Inference;
DOI
10.1145/3447548.3467078
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In many industry-scale applications, large and resource-consuming machine learning models reside in powerful cloud servers. At the same time, large amounts of input data are collected at the edge of the cloud. The inference results are also communicated to users or passed to downstream tasks at the edge. The edge often consists of a large number of low-power devices. It is a big challenge to design industry products that support sophisticated deep model deployment and conduct model inference efficiently, so that model accuracy remains high and end-to-end latency is kept low. This paper describes the techniques and engineering practice behind AUTO-SPLIT, an edge-cloud collaborative prototype of Huawei Cloud. This patented technology is already validated on selected applications, is on its way to broader systematic edge-cloud application integration, and is being made available for public use as an automated pipeline service for end-to-end cloud-edge collaborative intelligence deployment. To the best of our knowledge, there is no existing industry product that provides the capability of Deep Neural Network (DNN) splitting.
Pages: 2543-2553
Page count: 11