Edge/Cloud Infinite-Time Horizon Resource Allocation for Distributed Machine Learning and General Tasks

Cited by: 2
Authors
Sartzetakis, Ippokratis [1,2]
Soumplis, Polyzois [1,2]
Pantazopoulos, Panagiotis [2]
Katsaros, Konstantinos V. [2]
Sourlas, Vasilis [2]
Varvarigos, Emmanouel [1,2]
Affiliations
[1] Natl Tech Univ Athens, Sch Elect & Comp Engn, Athens 15773, Greece
[2] Natl Tech Univ Athens, Inst Commun & Comp Syst, Athens 15773, Greece
Source
IEEE TRANSACTIONS ON NETWORK AND SERVICE MANAGEMENT | 2024 / Vol. 21 / No. 1
Funding
EU Horizon 2020;
Keywords
Cloud and edge computing; distributed computing; distributed machine learning; inference; training; resource allocation; INTERNET; IOT;
DOI
10.1109/TNSM.2023.3312593
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Edge computing has emerged as a computing paradigm where the application and data processing takes place close to the end devices. It decreases the distances over which data transfers are made, offering reduced delay and fast speed of action for general data processing and store/retrieve jobs. The benefits of edge computing can also be reaped for distributed computation algorithms, where the cloud also plays an assistive role. In this context, an important challenge is to allocate the required resources at both edge and cloud to carry out the processing of data that are generated over a continuous ("infinite") time horizon. This is a complex problem due to the variety of requirements (resource needs, accuracy, delay, etc.) that may be posed by each computation algorithm, as well as the heterogeneous resources' features (e.g., processing, bandwidth). In this work, we develop a solution for serving weakly coupled general distributed algorithms, with emphasis on machine learning algorithms, at the edge and/or the cloud. We present a dual-objective Integer Linear Programming formulation that optimizes monetary cost and computation accuracy. We also introduce efficient heuristics to perform the resource allocation. We examine various distributed ML allocation scenarios using realistic parameters from actual vendors. We quantify trade-offs related to accuracy, performance and cost of edge/cloud bandwidth and processing resources. Our results indicate that among the many parameters of interest, the processing costs seem to play the most important role for the allocation decisions. Finally, we explore interesting interactions between target accuracy, monetary cost and delay.
Pages: 697-713
Number of pages: 17
Related papers
50 in total
[21]   DyRAM: Dynamic Data Allocation and Resource Management in Distributed Machine Learning Systems [J].
Tiwari, Vaibhavi ;
Thakkar, Rahul ;
Wang, Jiayin .
2024 IEEE 15TH ANNUAL UBIQUITOUS COMPUTING, ELECTRONICS & MOBILE COMMUNICATION CONFERENCE, UEMCON, 2024, :119-126
[22]   Q-Learning Algorithm for Joint Computation Offloading and Resource Allocation in Edge Cloud [J].
Dab, Boutheina ;
Aitsaadi, Nadjib ;
Langar, Rami .
2019 IFIP/IEEE SYMPOSIUM ON INTEGRATED NETWORK AND SERVICE MANAGEMENT (IM), 2019,
[23]   Preemptive Scheduling for Distributed Machine Learning Jobs in Edge-Cloud Networks [J].
Wang, Ne ;
Zhou, Ruiting ;
Jiao, Lei ;
Zhang, Renli ;
Li, Bo ;
Li, Zongpeng .
IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, 2022, 40 (08) :2411-2425
[24]   Resource Allocation in Flexible-Bandwidth Fine-Grained Optical Transport Networks for Geo-Distributed Machine Learning [J].
Lian, Meng ;
Zhao, Yongli ;
Li, Xin ;
Liu, Wenhong ;
Li, Yajie ;
Tornatore, Massimo ;
Zhang, Jie .
IEEE INTERNET OF THINGS JOURNAL, 2025, 12 (13) :25601-25619
[25]   Security computing resource allocation based on deep reinforcement learning in serverless multi-cloud edge computing [J].
Zhang, Hang ;
Wang, Jinsong ;
Zhang, Hongwei ;
Bu, Chao .
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2024, 151 :152-161
[26]   Joint Adaptive Aggregation and Resource Allocation for Hierarchical Federated Learning Systems Based on Edge-Cloud Collaboration [J].
Su, Yi ;
Fan, Wenhao ;
Meng, Qingcheng ;
Chen, Penghui ;
Liu, Yuan'an .
IEEE TRANSACTIONS ON CLOUD COMPUTING, 2025, 13 (01) :369-382
[27]   Profit-Maximized Collaborative Computation Offloading and Resource Allocation in Distributed Cloud and Edge Computing Systems [J].
Yuan, Haitao ;
Zhou, MengChu .
IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2021, 18 (03) :1277-1287
[28]   Dynamic Resource Allocation on the Edge: A Causal and Contextually-Aware Machine Learning Approach [J].
Symvoulidis, Chrysostomos ;
Paraskevoulakou, Efterpi ;
Kiourtis, Athanasios ;
Mavrogiorgou, Argyro ;
Kyriazis, Dimosthenis .
INTELLIGENT SYSTEMS AND APPLICATIONS, VOL 4, INTELLISYS 2024, 2024, 1068 :300-313
[29]   Deep Reinforcement Learning Based Resource Allocation Strategy in Cloud-Edge Computing System [J].
Xu, Jianqiao ;
Xu, Zhuohan ;
Shi, Bing .
FRONTIERS IN BIOENGINEERING AND BIOTECHNOLOGY, 2022, 10
[30]   Deep Reinforcement Learning Based Resource Allocation Strategy in Cloud-Edge Computing System [J].
Xu, Zhuohan ;
Zhong, Zeheng ;
Shi, Bing .
2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,