Two-Phase Split Computing Framework in Edge-Cloud Continuum

Cited by: 1
Authors
Ko, Haneul [1]
Kim, Bokyeong [1]
Kim, Yumi [1]
Pack, Sangheon [2]
Affiliations
[1] Kyung Hee Univ, Dept Elect & Informat Convergence Engn, Yongin 17104, Gyeonggi, South Korea
[2] Korea Univ, Sch Elect Engn, Seoul 02841, South Korea
Source
IEEE INTERNET OF THINGS JOURNAL | 2024 / Vol. 11 / No. 12
Funding
National Research Foundation, Singapore;
Keywords
Cloud computing; Computational modeling; Mobile handsets; Internet of Things; Artificial neural networks; Performance evaluation; Optimization; Deep neural network (DNN); inference latency; interlayer splitting; intralayer splitting; two-phase split computing; ASSISTED FULL-DUPLEX; MASSIVE-MIMO; SPECTRAL EFFICIENCY; NETWORK; THROUGHPUT; PILOT;
DOI
10.1109/JIOT.2024.3376977
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Split computing is a promising approach to reducing the inference latency of deep neural network (DNN) models. In this article, we propose a two-phase split computing framework (TSCF). In the first phase, TSCF performs vertical interlayer splitting between computing nodes at different levels (e.g., central and edge clouds): the split decision is formulated as a shortest path problem in a directed graph, and a pruning-based low-complexity solution is devised. In the second phase, TSCF performs horizontal intralayer splitting between computing nodes at the same level (e.g., edge clouds): the execution units of a specific layer are further divided and distributed to those nodes in proportion to their available resources. The evaluation results demonstrate that TSCF can reduce inference latency by more than 38.8% compared with the traditional interlayer splitting scheme by efficiently using the resources of distributed computing nodes. The results also show that near-optimal inference latency can be achieved even with the pruning-based low-complexity solution.
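The proportional intralayer split described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `split_units`, the representation of available resources as plain numbers, and the largest-remainder rounding rule are all assumptions made for illustration, since the abstract only states that execution units are distributed in proportion to available resources.

```python
def split_units(total_units, resources):
    """Divide a layer's execution units among same-level nodes.

    Hypothetical helper: each node receives a share proportional to its
    available resources. Rounding by largest remainder is an assumption;
    the abstract does not specify how fractional shares are handled.
    """
    if total_units < 0:
        raise ValueError("total_units must be non-negative")
    total_resources = sum(resources)
    if total_resources <= 0:
        raise ValueError("at least one node must have available resources")

    # Ideal (possibly fractional) share of units for each node.
    ideal = [total_units * r / total_resources for r in resources]

    # Start from the integer part of each share.
    alloc = [int(share) for share in ideal]

    # Hand the remaining units to the nodes with the largest fractional parts.
    leftover = total_units - sum(alloc)
    by_remainder = sorted(
        range(len(resources)),
        key=lambda i: ideal[i] - alloc[i],
        reverse=True,
    )
    for i in by_remainder[:leftover]:
        alloc[i] += 1
    return alloc


if __name__ == "__main__":
    # Example: 64 units (e.g., filters of one layer) over three edge nodes
    # whose available resources are in the ratio 4:2:2.
    print(split_units(64, [4, 2, 2]))
```

Largest-remainder rounding is used here only so that the integer allocations always sum to the original number of units; any other rounding rule with that property would serve equally well in this sketch.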
Pages: 21741-21749
Number of pages: 9