Dynamic Pricing and Placing for Distributed Machine Learning Jobs: An Online Learning Approach

Cited by: 3
Authors
Zhou, Ruiting [1]
Zhang, Xueying [2]
Lui, John C. S. [3]
Li, Zongpeng [4]
Affiliations
[1] Southeast Univ, Sch Comp Sci Engn, Nanjing 211189, Peoples R China
[2] Wuhan Univ, Sch Cyber Sci & Engn, Wuhan 430072, Peoples R China
[3] Chinese Univ Hong Kong, Dept Comp Sci & Engn, Hong Kong, Peoples R China
[4] Tsinghua Univ, Inst Network Sci & Cyberspace, Beijing 100084, Peoples R China
Keywords
Pricing; Runtime; Cloud computing; Servers; Heuristic algorithms; Dynamic scheduling; Costs; Machine learning; dynamic pricing; online placement; RESOURCE-ALLOCATION; OPTIMIZATION
DOI
10.1109/JSAC.2023.3242707
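Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic and communication technology]
Subject classification code
0808; 0809
Abstract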
Distributed machine learning (ML) jobs now usually adopt a parameter server (PS) framework to train models over large-scale datasets. Such an ML job deploys hundreds of concurrent workers, and model parameter updates are exchanged frequently between workers and PSs. In current practice, workers and PSs may be placed on different physical servers, which makes a job's runtime uncertain. Existing cloud pricing policies often charge a fixed price according to the job's runtime. Although this pricing strategy is simple to implement, it is not suitable for distributed ML jobs, whose runtime is stochastic and can only be estimated from the job's placement after admission. To supplement existing cloud pricing schemes, we design a dynamic pricing and placement algorithm, DPS, for distributed ML jobs. DPS aims to maximize the cloud service provider's profit: it dynamically calculates the unit resource price upon a job's arrival and, if the user accepts the offered price, determines the job's placement to minimize its runtime. Our design exploits the multi-armed bandit (MAB) technique to learn unknown information from past sales. DPS balances exploration and exploitation, and selects the best price based on a reward that is related to job runtime. Compared with benchmark pricing schemes, our learning-based algorithm can increase the provider's profit by 200%, and it achieves regret that is sub-linear in both the time horizon and the total number of jobs. Extensive evaluations using real-world data also validate the efficacy of DPS.
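The abstract does not give the details of DPS, so the following is only a minimal sketch of the general idea of bandit-based posted pricing, not the authors' algorithm. It swaps in the standard UCB1 rule over a discrete price grid and uses normalized revenue as the reward, whereas DPS uses a reward related to job runtime and also decides placement. The function name `ucb1_pricing`, the price grid, and the acceptance model `accept_prob` are all hypothetical and chosen for illustration.

```python
import math
import random


def ucb1_pricing(prices, accept_prob, rounds, seed=0):
    """Illustrative UCB1 over a fixed price grid (an assumption, not DPS).

    prices      -- candidate unit prices (the bandit arms)
    accept_prob -- hypothetical model: price -> probability a job accepts
    rounds      -- number of arriving jobs
    Returns (counts, revenue): how often each price was offered, and the
    total revenue collected.
    """
    rng = random.Random(seed)
    n_arms = len(prices)
    counts = [0] * n_arms      # times each price was offered
    totals = [0.0] * n_arms    # sum of normalized rewards per price
    max_price = max(prices)    # scales rewards into [0, 1]
    revenue = 0.0

    for t in range(1, rounds + 1):
        if t <= n_arms:
            arm = t - 1  # exploration: offer every price once
        else:
            # exploitation plus confidence bonus (standard UCB1 index)
            arm = max(
                range(n_arms),
                key=lambda i: totals[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        price = prices[arm]
        accepted = rng.random() < accept_prob(price)
        reward = price / max_price if accepted else 0.0
        counts[arm] += 1
        totals[arm] += reward
        if accepted:
            revenue += price
    return counts, revenue


# Hypothetical scenario: every job accepts any price up to 3 and rejects 4.
counts, revenue = ucb1_pricing(
    prices=[1, 2, 3, 4],
    accept_prob=lambda p: 1.0 if p <= 3 else 0.0,
    rounds=2000,
)
print(counts, revenue)
```

In this toy scenario the highest price that is still accepted yields the most revenue, so the index rule concentrates its offers there while occasionally re-testing the other prices. That re-testing is the exploration and exploitation balance the abstract refers to.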
Pages: 1135-1150
Page count: 16