A modular approach to build a hardware testbed for cloud resource management research

Cited by: 0
Authors
Lucia Pons
Salvador Petit
Julio Pons
María E. Gómez
Julio Sahuquillo
Affiliation
[1] Universitat Politècnica de València
Source
The Journal of Supercomputing | 2024 / Vol. 80
Keywords
Cloud computing; High-performance computing; Resource management; Virtualization; Experimental framework
DOI
Not available
Abstract
Research on resource management focuses on optimizing system performance and energy efficiency by distributing shared resources like processor cores, caches, and main memory among competing applications. This research spans a wide range of applications, including those from high-performance computing, machine learning, and mobile computing. Existing research frameworks often simplify research by concentrating on specific characteristics, such as the architecture of the computing nodes, resource monitoring, and representative workloads. For instance, this is typically the case with cloud systems, which introduce additional complexity regarding hardware and software requirements. To avoid this complexity during research, experimental frameworks are being developed. Nevertheless, proposed frameworks often fail regarding the types of nodes included, virtualization support, and management of critical shared resources. This paper presents Stratus, an experimental framework that overcomes these limitations. Stratus includes different types of nodes, a comprehensive virtualization stack, and the ability to partition the major shared resources of the system. Even though Stratus was originally conceived to perform cloud research, its modular design allows Stratus to be extended, broadening its research use on different computing domains and platforms, matching the complexity of modern cloud environments, as shown in the case studies presented in this paper.
Pages: 10552-10583
Page count: 31