An Online Approach for DNN Model Caching and Processor Allocation in Edge Computing

Cited by: 4
Authors
Chen, Zhiqi [1]
Zhang, Sheng [1]
Ma, Zhi [1]
Zhang, Shuai [1]
Qian, Zhuzhong [1]
Xiao, Mingjun [2]
Wu, Jie [3]
Lu, Sanglu [1]
Affiliations
[1] Nanjing Univ, State Key Lab Novel Software Technol, Nanjing, Jiangsu, Peoples R China
[2] Univ Sci & Technol China, Sch Comp Sci & Technol, Hefei, Anhui, Peoples R China
[3] Temple Univ, Ctr Networked Comp, Philadelphia, PA 19122 USA
Keywords
Edge Computing; DNN Model Caching; Proximity Inferences; Gibbs Sampling; Lyapunov Optimization; Service Placement
DOI
10.1109/IWQoS54832.2022.9812874
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Edge computing is a computing paradigm that has risen steadily in recent years. Applications such as object detection, virtual reality, and intelligent cameras often rely on Deep Neural Network (DNN) inference. The traditional cloud-based paradigm of DNN inference suffers from high delay because of limited bandwidth. From the perspective of service providers, caching DNN models on the edge brings several benefits, such as efficiency, privacy, and security. The problem we address in this paper is how to decide which models to cache and how to allocate the processors of edge servers so as to reduce the overall system cost. To solve it, we model and study the DNN Model Caching and Processor Allocation (DMCPA) problem, which considers user-perceived delay and energy consumption under limited edge resources. We formulate it as an integer nonlinear programming (INLP) problem and prove its NP-completeness. Since it is a long-term average optimization problem, we leverage the Lyapunov framework to develop a novel online algorithm, DMCPA-GS-Online, based on Gibbs sampling. We provide a theoretical analysis proving that our algorithm is near-optimal. In experiments, we study the performance of our algorithm and compare it with baselines. Simulation results on a real-world trace dataset demonstrate the effectiveness and adaptiveness of our algorithm.
Pages: 10