Reliability Analysis for Software Cluster Systems based on Proportional Hazard Model

Cited by: 0
Authors
Hou, Chunyan [1]
Chen, Chen [2]
Wang, Jinsong [1]
Shi, Kai [1]
Affiliations
[1] Tianjin Univ Technol, Sch Comp & Commun Engn, Tianjin, Peoples R China
[2] Nankai Univ, Coll Comp & Control Engn, Tianjin, Peoples R China
Source
PROCEEDINGS 2016 IEEE 40TH ANNUAL COMPUTER SOFTWARE AND APPLICATIONS CONFERENCE WORKSHOPS, VOL 1 | 2016
Keywords
cluster system; load-sharing system; cumulative workload; software reliability; software aging; PARAMETER-ESTIMATION; LOAD; COMPONENTS; LIFETIME;
DOI
10.1109/COMPSAC.2016.177
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification codes
081203 ; 0835 ;
Abstract
As software cluster systems become widespread, their reliability is drawing growing attention from both academia and industry. A cluster system is a kind of software load-sharing system (LSS) whose reliability depends significantly on its system software. Traditional reliability analysis methods for hardware LSSs are therefore not applicable to cluster systems. In this paper, we develop a reliability analysis model for redundant cluster systems consisting of initial servers and cold standby servers that replace failed ones. The system reliability process is modeled as a state-based non-homogeneous Markov process (NHMP), where each state corresponds to a non-homogeneous Poisson process (NHPP). The NHPP arrival rate is expressed using Cox's proportional hazard model (PHM) in terms of the cumulative and instantaneous workload of the system software. In addition to redundant cluster systems without repair, the model can also be extended to analyze those with restart. The analysis results can support cluster management and design decisions. Finally, evaluation experiments show the potential of our model.
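As a rough illustration of the rate construction named in the abstract, the sketch below evaluates a Cox-style proportional hazard intensity, baseline(t) * exp(sum of beta_i * z_i(t)), and the resulting NHPP probability of no failure up to a given time, exp(-integral of the rate). It is not the paper's model: the constant baseline, the coefficient values, the workload profile, and the names `phm_rate`, `reliability`, and `example_rate` are all assumptions made for illustration, since this record gives no parameter values or functional forms.

```python
import math

def phm_rate(t, baseline, betas, covariates):
    """Proportional hazard intensity: baseline(t) * exp(sum_i beta_i * z_i(t))."""
    z = covariates(t)
    return baseline(t) * math.exp(sum(b * zi for b, zi in zip(betas, z)))

def reliability(t_end, rate, steps=10_000):
    """Probability of no NHPP event in [0, t_end]: exp(-integral of rate)."""
    h = t_end / steps
    # Composite trapezoidal rule for the cumulative intensity.
    total = 0.5 * (rate(0.0) + rate(t_end))
    total += sum(rate(i * h) for i in range(1, steps))
    return math.exp(-h * total)

# Hypothetical setup (all values are assumptions, not taken from the paper):
# constant baseline rate 0.01, constant instantaneous workload w,
# cumulative workload w * t, and coefficients (0.05, 0.3).
def example_rate(t, w=2.0):
    return phm_rate(
        t,
        baseline=lambda s: 0.01,
        betas=(0.05, 0.3),
        covariates=lambda s: (w * s, w),
    )

if __name__ == "__main__":
    print(reliability(10.0, example_rate))
```

With positive coefficients, a heavier workload raises the intensity and lowers the reliability, which is the qualitative behavior a load-sharing model of this kind is meant to capture.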
Pages: 32-41
Page count: 10
Related papers
25 in total
[1]   An approach to software reliability prediction based on time series modeling [J].
Amin, Ayman ;
Grunske, Lars ;
Colman, Alan .
JOURNAL OF SYSTEMS AND SOFTWARE, 2013, 86 (07) :1923-1932
[2]  
[Anonymous], P WORKSH SOFTW AG RE
[3]  
[Anonymous], 2010 INT C INF SCI A
[4]  
Avritzer A., 1997, Empirical Software Engineering, V2, P55
[5]  
Cox D.R., 1972, J ROYAL STAT SOC B, V34, P187-220
[6]   On systems with shared resources and optimal switching strategies [J].
Finkelstein, Maxim .
RELIABILITY ENGINEERING & SYSTEM SAFETY, 2009, 94 (08) :1358-1362
[7]   A methodology for detection and estimation of software aging [J].
Garg, S ;
van Moorsel, A ;
Vaidyanathan, K ;
Trivedi, KS .
NINTH INTERNATIONAL SYMPOSIUM ON SOFTWARE RELIABILITY ENGINEERING, PROCEEDINGS, 1998, :283-292
[8]   Architecture-based approach to reliability assessment of software systems [J].
Goseva-Popstojanova, K ;
Trivedi, KS .
PERFORMANCE EVALUATION, 2001, 45 (2-3) :179-204
[9]   Analysis of software aging in a web server [J].
Grottke, Michael ;
Li, Lei ;
Vaidyanathan, Kalyanaraman ;
Trivedi, Kishor S. .
IEEE TRANSACTIONS ON RELIABILITY, 2006, 55 (03) :411-420
[10]   Tools and Experiments Supporting a Testing-Based Theory of Component Composition [J].
Hamlet, Dick .
ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY, 2009, 18 (03)