Service Cost Effective and Reliability Aware Job Scheduling Algorithm on Cloud Computing Systems

Cited by: 3
Authors
Tang, Xiaoyong [1]
Liu, Yi [1]
Zeng, Zeng [2]
Veeravalli, Bharadwaj [3]
Affiliations
[1] Changsha Univ Sci & Technol, Sch Comp & Commun Engn, Changsha 410114, Peoples R China
[2] ASTAR, Inst Infocomm Res, Singapore 138632, Singapore
[3] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore 117576, Singapore
Funding
National Natural Science Foundation of China
Keywords
Cloud computing systems; service; reliability; cost; job scheduling; VIRTUAL MACHINE PLACEMENT; RESOURCE-MANAGEMENT; ENERGY-EFFICIENT; TIME; CONSOLIDATION; TASKS; MODEL;
DOI
10.1109/TCC.2021.3137323
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Nowadays, an increasing number of services are provided to individuals and organizations through cloud computing systems in a pay-as-you-use model. This business service paradigm encounters several cloud Quality of Service (QoS) challenges, such as reliability, cost, and response time. The most common mechanism to improve cloud service reliability is the primary/backup (PB) fault-tolerant technique. However, this reliability enhancement technique inevitably results in multiple replications, which lead to high service cost. In recognition of these challenges, we first build a resource management architecture for cloud computing systems. Second, we analyze cloud service execution reliability on the physical resources of a VM and use a CUDA (Compute Unified Device Architecture)-enabled parallel two-dimensional long short-term memory neural network to predict the software faults of a cloud VM. Third, we propose an effective primary/backup cloud service cost calculation approach. To meet the cloud service response time constraint, we integrate a response time slack factor into this method. Fourth, we formulate the cloud service reliability and cost aware job scheduling problem, which aims at minimizing the total cloud service cost and rejection rate while improving system reliability. Fifth, we propose a heuristic greedy reliability and cost aware job scheduling (RCJS) algorithm. Finally, a performance evaluation is conducted, and the experimental results demonstrate that the proposed RCJS algorithm significantly outperforms the optimal redundant VM placement (OPVMP) and MIN-MIN algorithms in terms of average service cost and rejection rate. The algorithm also demonstrates a good reliability trade-off compared with the other two algorithms and is suitable for cloud services with high reliability and low-cost requirements.
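The trade-off the abstract describes (a backup replica raises reliability but also raises cost, subject to a response time limit) can be illustrated with a minimal sketch. This is not the paper's RCJS algorithm or its cost model. The `VM` fields, the independence assumption between replicas, the pessimistic "both replicas run to completion" cost, and the way `slack` scales the deadline are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class VM:
    name: str
    speed: float        # work units per second (hypothetical unit)
    price: float        # cost per second of execution (hypothetical unit)
    reliability: float  # assumed probability that the VM finishes a job


def pb_reliability(primary: VM, backup: VM) -> float:
    # Assumption: replicas fail independently, so the job fails only
    # if both the primary and the backup fail.
    return 1.0 - (1.0 - primary.reliability) * (1.0 - backup.reliability)


def pb_cost(work: float, primary: VM, backup: VM) -> float:
    # Assumption: pessimistic cost, both replicas run the whole job.
    primary_cost = work / primary.speed * primary.price
    backup_cost = work / backup.speed * backup.price
    return primary_cost + backup_cost


def cheapest_pb_pair(
    work: float,
    deadline: float,
    slack: float,
    min_reliability: float,
    vms: List[VM],
) -> Optional[Tuple[VM, VM]]:
    """Pick the cheapest feasible (primary, backup) pair for one job.

    Returns None when no pair meets the time and reliability limits,
    which corresponds to rejecting the job.
    """
    time_budget = deadline * slack  # hypothetical use of a slack factor
    best: Optional[Tuple[VM, VM]] = None
    best_cost = float("inf")
    for primary in vms:
        for backup in vms:
            if primary is backup:
                continue  # a replica must run on a different VM
            if work / primary.speed > time_budget:
                continue
            if work / backup.speed > time_budget:
                continue
            if pb_reliability(primary, backup) < min_reliability:
                continue
            cost = pb_cost(work, primary, backup)
            if cost < best_cost:
                best, best_cost = (primary, backup), cost
    return best


if __name__ == "__main__":
    pool = [
        VM("a", speed=2.0, price=1.0, reliability=0.90),
        VM("b", speed=1.0, price=0.3, reliability=0.80),
        VM("c", speed=4.0, price=3.0, reliability=0.99),
    ]
    pair = cheapest_pb_pair(8.0, 10.0, 1.0, 0.99, pool)
    print(None if pair is None else (pair[0].name, pair[1].name))
```

In this sketch each job is placed on its own, on the cheapest pair that is feasible at that moment, which is the greedy element. A scheduler along the lines the abstract describes would also have to track VM load and predicted faults over time, which this example omits.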
Pages: 1461-1473
Page count: 13