Enhanced Scheduling of AI Applications in Multi-Tenant Cloud Using Genetic Optimizations

Cited by: 1
Authors
Kwon, Seokmin [1 ]
Bahn, Hyokyung [1 ]
Affiliations
[1] Ewha Womans Univ, Dept Comp Engn, Seoul 03760, South Korea
Source
APPLIED SCIENCES-BASEL | 2024 / Vol. 14 / Issue 11
Keywords
task scheduling; artificial intelligence; machine learning; cloud; genetic algorithm
DOI
10.3390/app14114697
Chinese Library Classification (CLC) number
O6 [Chemistry]
Discipline classification code
0703
Abstract
The artificial intelligence (AI) industry is increasingly integrating with diverse sectors such as smart logistics, FinTech, entertainment, and cloud computing. This expansion has led to the coexistence of heterogeneous applications within multi-tenant systems, presenting significant scheduling challenges. This paper addresses these challenges by exploring the scheduling of various machine learning workloads in large-scale, multi-tenant cloud systems that utilize heterogeneous GPUs. Traditional scheduling strategies often struggle to achieve satisfactory results due to low GPU utilization in these complex environments. To address this issue, we propose a novel scheduling approach that employs a genetic optimization technique, implemented within a process-oriented discrete-event simulation framework, to effectively orchestrate various machine learning tasks. We evaluate our approach using workload traces from Alibaba's MLaaS cluster with over 6000 heterogeneous GPUs. The results show that our scheduling improves GPU utilization by 12.8% compared to Round-Robin scheduling, demonstrating the effectiveness of the solution in optimizing cloud-based GPU scheduling.
Pages: 15
Related papers
50 items in total
  • [21] Multi-Tenant Cloud Service Composition using Evolutionary Optimization
    Kumar, Satish
    Bahsoon, Rami
    Chen, Tao
    Li, Ke
    Buyya, Rajkumar
    2018 IEEE 24TH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS (ICPADS 2018), 2018, : 972 - 979
  • [22] Understanding performance interference in multi-tenant cloud databases and web applications
    Xavier, Miguel G.
    Matteussi, Kassiano J.
    Lorenzo, Fabian
    De Rose, Cesar A. F.
    2016 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2016, : 2847 - 2852
  • [23] Analyzing Multi-Tenant Cloud Services' Accountability
    Masmoudi, Fatma
    Sellami, Mohamed
    Loulou, Monia
    Kacem, Ahmed Hadj
    2015 IEEE 12TH INTERNATIONAL CONFERENCE ON E-BUSINESS ENGINEERING (ICEBE), 2015, : 239 - 244
  • [24] Performance Study of Multi-tenant Cloud FPGAs
    Mbongue, Joel Mandebi
    Saha, Sujan Kumar
    Bobda, Christophe
    2021 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2021, : 168 - 171
  • [25] Accountability management for multi-tenant cloud services
    Masmoudi, Fatma
    Sellami, Mohamed
    Loulou, Monia
    Kacem, Ahmed Hadj
    INTERNATIONAL JOURNAL OF GRID AND UTILITY COMPUTING, 2019, 10 (02) : 141 - 158
  • [26] Framework for Management of Multi-tenant Cloud Environments
    Beranek, Marek
    Kovar, Vladimir
    Feuerlicht, George
    CLOUD COMPUTING - CLOUD 2018, 2018, 10967 : 309 - 322
  • [27] Elastic Scaling in the Cloud: A Multi-Tenant Perspective
    Rameshan, Navaneeth
    Liu, Ying
    Navarro, Leandro
    Vlassov, Vladimir
    2016 IEEE 36TH INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS WORKSHOPS (ICDCSW 2016), 2016, : 25 - 30
  • [28] Network Function Virtualization in the Multi-Tenant Cloud
    Yu, Ruozhou
    Xue, Guoliang
    Kilari, Vishnu Teja
    Zhang, Xiang
    IEEE NETWORK, 2015, 29 (03) : 42 - 47
  • [29] A Multi-Tenant Framework for Cloud Container Services
    Zheng, Chao
    Zhuang, Qinghui
    Guo, Fei
    2021 IEEE 41ST INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS (ICDCS 2021), 2021, : 359 - 369
  • [30] Cost-Effective Feature Placement of Customizable Multi-Tenant Applications in the Cloud
    Moens, Hendrik
    Truyen, Eddy
    Walraven, Stefan
    Joosen, Wouter
    Dhoedt, Bart
    De Turck, Filip
    JOURNAL OF NETWORK AND SYSTEMS MANAGEMENT, 2014, 22 (04) : 517 - 558