Dynamic Pricing and Placing for Distributed Machine Learning Jobs: An Online Learning Approach

Cited by: 3

Authors:

Zhou, Ruiting [1]

Zhang, Xueying [2]

Lui, John C. S. [3]

Li, Zongpeng [4]

Affiliations:

[1] Southeast Univ, Sch Comp Sci Engn, Nanjing 211189, Peoples R China

[2] Wuhan Univ, Sch Cyber Sci & Engn, Wuhan 430072, Peoples R China

[3] Chinese Univ Hong Kong, Dept Comp Sci & Engn, Hong Kong, Peoples R China

[4] Tsinghua Univ, Inst Network Sci & Cyberspace, Beijing 100084, Peoples R China

Source:

IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS | 2023 / Vol. 41 / No. 04

Keywords:

Pricing; Runtime; Cloud computing; Servers; Heuristic algorithms; Dynamic scheduling; Costs; Machine learning; dynamic pricing; online placement; RESOURCE-ALLOCATION; OPTIMIZATION;

DOI:

10.1109/JSAC.2023.3242707

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

Distributed machine learning (ML) jobs now usually adopt a parameter server (PS) framework to train models over large-scale datasets. Such an ML job deploys hundreds of concurrent workers, and model parameter updates are exchanged frequently between workers and PSs. In current practice, workers and PSs may be placed on different physical servers, which makes a job's runtime uncertain. Existing cloud pricing policies often charge a fixed price according to the job's runtime. Although this pricing strategy is simple to implement, it is not suitable for distributed ML jobs, whose runtime is stochastic and can only be estimated from the job's placement after admission. To supplement existing cloud pricing schemes, we design a dynamic pricing and placement algorithm, DPS, for distributed ML jobs. DPS aims to maximize the cloud service provider's profit: it dynamically calculates the unit resource price upon a job's arrival and, if the user accepts the offered price, determines the job's placement to minimize its runtime. Our design exploits the multi-armed bandit (MAB) technique to learn unknown information from past sales. DPS balances exploration and exploitation, and selects the best price based on a reward that is related to job runtime. Compared to benchmark pricing schemes, our learning-based algorithm can increase the provider's profit by 200%, and it achieves a regret that is sub-linear in both the time horizon and the total number of jobs. Extensive evaluations using real-world data also validate the efficacy of DPS.
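
The bandit-based pricing idea in the abstract can be illustrated with a minimal sketch. This is not the DPS algorithm itself: it swaps in the standard UCB1 rule over a small discrete price grid, treats each arriving job as one round, and uses revenue (not a runtime-related reward) as the feedback. The function name `ucb1_pricing`, the price grid, and the acceptance model `accept_prob` are all hypothetical and chosen only for illustration.

```python
import math
import random


def ucb1_pricing(prices, accept_prob, rounds, seed=0):
    """Illustrative UCB1 over a discrete price grid (NOT the paper's DPS).

    prices      -- candidate unit prices (hypothetical grid)
    accept_prob -- assumed model: probability a user accepts a given price
    rounds      -- number of arriving jobs, one price offer per job
    Returns (pull counts per price, total revenue collected).
    """
    rng = random.Random(seed)
    counts = [0] * len(prices)
    totals = [0.0] * len(prices)  # sum of rewards normalized to [0, 1]
    max_price = max(prices)
    revenue = 0.0

    for t in range(1, rounds + 1):
        if t <= len(prices):
            arm = t - 1  # exploration: offer every price once
        else:
            # exploitation plus confidence bonus (standard UCB1 index)
            arm = max(
                range(len(prices)),
                key=lambda i: totals[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        accepted = rng.random() < accept_prob(prices[arm])
        counts[arm] += 1
        if accepted:
            totals[arm] += prices[arm] / max_price
            revenue += prices[arm]
    return counts, revenue


if __name__ == "__main__":
    # Assumed demand: most users accept prices up to 2.0, few accept 4.0.
    counts, revenue = ucb1_pricing(
        [1.0, 2.0, 4.0],
        lambda p: 0.9 if p <= 2.0 else 0.1,
        rounds=5000,
    )
    print(counts, revenue)
```

Under the assumed demand model, the middle price has the highest expected revenue per offer (2.0 × 0.9 = 1.8, against 0.9 and 0.4), so the learner should concentrate most of its offers there as rounds accumulate.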


Pages: 1135 - 1150

Page count: 16

50 items in total

[31] Machine Learning Feature Based Job Scheduling for Distributed Machine Learning Clusters
Wang, Haoyu
Liu, Zetian
Shen, Haiying
IEEE-ACM TRANSACTIONS ON NETWORKING, 2023, 31 (01) : 58 - 73
[32] A Survey on Machine Learning for Geo-Distributed Cloud Data Center Managements
Hogade, Ninad
Pasricha, Sudeep
IEEE TRANSACTIONS ON SUSTAINABLE COMPUTING, 2023, 8 (01) : 15 - 31
[33] Distributed Online Learning for Leaderless Multicluster Games in Dynamic Environments
Yu, Rui
Meng, Min
Li, Li
IEEE TRANSACTIONS ON CONTROL OF NETWORK SYSTEMS, 2024, 11 (03) : 1548 - 1561
[34] Online Testing in Machine Learning Approach for Fall Detection
Martinez-Villasenor, Lourdes
Ponce, Hiram
Nunez-Martinez, Jose
Pacheco, Sofia
2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020
[35] Dynamic Pricing and Learning with Discounting
Feng, Zhichao
Dawande, Milind
Janakiraman, Ganesh
Qi, Anyan
OPERATIONS RESEARCH, 2024, 72 (02) : 481 - 492
[36] Dynamic pricing and inventory management with demand learning: A bayesian approach
Liu, Jue
Pang, Zhan
Qi, Linggang
COMPUTERS & OPERATIONS RESEARCH, 2020, 124
[37] Learning Curve: A Simulation-Based Approach to Dynamic Pricing
DiMicco, Joan Morris
Maes, Pattie
Greenwald, Amy
Electronic Commerce Research, 2003, 3 (3-4) : 245 - 276
[38] An Online Learning Algorithm for Distributed Task Offloading in Multi-Access Edge Computing
Sun, Zhenfeng
Nakhai, Mohammad Reza
IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2020, 68 (68) : 3090 - 3102
[39] A Combined Analytical Modeling Machine Learning Approach for Performance Prediction of MapReduce Jobs in Cloud Environment
Ataie, Ehsan
Gianniti, Eugenio
Ardagna, Danilo
Movaghar, Ali
PROCEEDINGS OF 2016 18TH INTERNATIONAL SYMPOSIUM ON SYMBOLIC AND NUMERIC ALGORITHMS FOR SCIENTIFIC COMPUTING (SYNASC), 2016 : 431 - 439
[40] Caching in Dynamic Environments: A Near-Optimal Online Learning Approach
Zhou, Shiji
Wang, Zhi
Hu, Chenghao
Mao, Yinan
Yan, Haopeng
Zhang, Shanghang
Wu, Chuan
Zhu, Wenwu
IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 792 - 804
