A Software Reliability Model for Cloud-Based Software Rejuvenation Using Dynamic Fault Trees

Cited by: 16
Authors
Rahme, Jean [1]
Xu, Haiping [1]
Affiliations
[1] Univ Massachusetts Dartmouth, Comp & Informat Sci Dept, N Dartmouth, MA 02747 USA
Keywords
Software aging; software rejuvenation; reliability analysis; dynamic fault tree (DFT); hot spare (HSP) gate; Markov chain; scheduling
DOI
10.1142/S021819401540029X
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Correctly measuring the reliability and availability of a cloud-based system is critical for evaluating its performance. Because the physical facilities behind cloud services promise high reliability, software faults have become one of the major causes of failure in cloud-based systems. In this paper, we focus on the software aging phenomenon, in which system performance progressively degrades due to exhaustion of system resources, fragmentation, and accumulation of errors. We use a proactive technique, called software rejuvenation, to counteract software aging. The dynamic fault tree (DFT) formalism is adopted to model system reliability before and during a software rejuvenation process in an aging cloud-based system. A novel analytical approach is presented to derive the reliability function of a cloud-based Hot SPare (HSP) gate, and its correctness is verified using Continuous Time Markov Chains (CTMC). We use a case study of a cloud-based system to illustrate the validity of our approach. Based on the analytical reliability results, we show how cost-effective software rejuvenation schedules can be created to keep system reliability consistently above a predefined critical level.
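As a rough illustration of the kind of check the abstract describes, the sketch below compares a closed-form reliability function for a two-unit hot spare gate against a numerically integrated Markov chain, then derives a rejuvenation interval from a reliability threshold. It is not the paper's model: it assumes two identical units with a constant failure rate `lam` (which does not capture aging), and it assumes rejuvenation restores the gate to an as-good-as-new state. The function names, `lam`, and `r_min` are hypothetical.

```python
import math


def hsp_reliability(lam, t):
    """Closed form: the gate survives while at least one of two active units survives."""
    return 1.0 - (1.0 - math.exp(-lam * t)) ** 2


def ctmc_reliability(lam, t, steps=2000):
    """Integrate the chain (2 up) -> (1 up) -> (failed) with a fixed-step RK4 scheme.

    Only the two operational states are tracked; their sum is the reliability.
    """
    def deriv(p):
        p2, p1 = p
        # Either of two units fails at rate lam, so the exit rate from (2 up) is 2*lam.
        return (-2.0 * lam * p2, 2.0 * lam * p2 - lam * p1)

    p = (1.0, 0.0)  # start with both units up
    h = t / steps
    for _ in range(steps):
        k1 = deriv(p)
        k2 = deriv(tuple(x + 0.5 * h * k for x, k in zip(p, k1)))
        k3 = deriv(tuple(x + 0.5 * h * k for x, k in zip(p, k2)))
        k4 = deriv(tuple(x + h * k for x, k in zip(p, k3)))
        p = tuple(
            x + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for x, a, b, c, d in zip(p, k1, k2, k3, k4)
        )
    return p[0] + p[1]


def rejuvenation_interval(lam, r_min):
    """Time at which the closed-form reliability falls to r_min (0 < r_min < 1).

    Assumption: rejuvenating at or before this interval resets reliability to 1.
    """
    return -math.log(1.0 - math.sqrt(1.0 - r_min)) / lam


if __name__ == "__main__":
    lam, r_min = 0.01, 0.99  # hypothetical failure rate (per hour) and critical level
    t_star = rejuvenation_interval(lam, r_min)
    print("closed form at t*:", hsp_reliability(lam, t_star))
    print("CTMC at t*:       ", ctmc_reliability(lam, t_star))
    print("rejuvenate every  ", t_star, "hours")
```

Under these assumptions the two reliability values should agree to within numerical integration error, which mirrors the idea of verifying an analytical result against a CTMC. A model of an aging system would need time-dependent or phase-type failure behavior instead of a single constant rate.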
Pages: 1491-1513
Page count: 23