Mean-field Macro Computation in Large-scale Cloud Service Systems with Resource Management and Job Scheduling

Cited by: 0
Authors
Feifei Yang
Yanping Jiang
Quanlin Li
Affiliations
[1] Northeastern University, School of Business Administration
[2] Beijing University of Technology, School of Economics and Management
Source
Journal of Systems Science and Systems Engineering | 2019 / Vol. 28
Keywords
Large-scale cloud service system; resource management; job scheduling; supermarket model; work stealing model; scheduling of public reserved resource
DOI
Not available
Abstract
Service computing is an emerging distributed computing mode in cloud service systems and has become an active research direction in both academia and industry. Cloud service systems display characteristics such as stochasticity, large scale, loose coupling, concurrency, non-homogeneity and heterogeneity, which make the study of their load balancing both interesting and challenging. Using resource management and job scheduling, this paper proposes an integrated, real-time and dynamic control mechanism for load balancing in large-scale cloud service systems, combining supermarket models with both work stealing models and the scheduling of public reserved resources. To this end, the paper provides a novel stochastic model with weak interactions by means of nonlinear Markov processes. To overcome the theoretical difficulties that arise from state explosion in high-dimensional stochastic systems, the paper applies mean-field theory to develop a macro computational technique in terms of an infinite-dimensional system of mean-field equations. Furthermore, the paper proves the asymptotic independence of the large-scale cloud service system and shows how to compute the fixed point by means of an infinite-dimensional system of nonlinear equations. Based on the fixed point, the paper provides effective numerical computation for performance analysis of the system with high approximation precision. The authors hope that the methodology and results can be applied to the study of more general large-scale cloud service systems.
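As a rough illustration of what a mean-field fixed point computation looks like, the sketch below uses the classical supermarket model only (Poisson arrivals at rate lam per server, exponential service at rate 1, each job joins the shortest of d randomly sampled queues). It is not the paper's model, which also includes work stealing and public reserved resources. The function names, the truncation level kmax and the Euler step size are assumptions made for this example. Here s[k] denotes the fraction of servers with at least k jobs, and the mean-field equations are ds_k/dt = lam * (s_{k-1}^d - s_k^d) - (s_k - s_{k+1}).

```python
def supermarket_fixed_point(lam, d, kmax):
    """Known closed-form fixed point s_k = lam^((d^k - 1)/(d - 1)), for d >= 2."""
    return [lam ** ((d ** k - 1) / (d - 1)) for k in range(kmax + 1)]


def mean_field_rhs(s, lam, d):
    """Right-hand side of the truncated mean-field equations.

    s[0] is held at 1 (every server has at least 0 jobs); the infinite
    system is truncated by assuming s[kmax + 1] = 0.
    """
    kmax = len(s) - 1
    ds = [0.0] * (kmax + 1)
    for k in range(1, kmax + 1):
        s_next = s[k + 1] if k < kmax else 0.0
        ds[k] = lam * (s[k - 1] ** d - s[k] ** d) - (s[k] - s_next)
    return ds


def integrate(lam, d, kmax=20, dt=0.01, steps=20000):
    """Forward-Euler integration starting from an empty system."""
    s = [1.0] + [0.0] * kmax
    for _ in range(steps):
        ds = mean_field_rhs(s, lam, d)
        s = [x + dt * dx for x, dx in zip(s, ds)]
    return s


if __name__ == "__main__":
    lam, d = 0.7, 2
    numeric = integrate(lam, d)
    closed = supermarket_fixed_point(lam, d, 20)
    for k in range(5):
        print(k, numeric[k], closed[k])
```

In this simplified setting the long-run solution of the equations can be compared against the closed-form fixed point. For a richer model such as the one in the paper, a closed form is generally not expected, and the fixed point would be obtained numerically from the corresponding nonlinear equations.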
Pages: 238-261
Page count: 23