ARF-Predictor: Effective Prediction of Aging-Related Failure Using Entropy

Cited by: 17
Authors
Chen, Pengfei [1]
Qi, Yong [1]
Li, Xinyi [1]
Hou, Di [1]
Lyu, Michael Rung-Tsong [2]
Affiliations
[1] Xi An Jiao Tong Univ, Dept Comp Sci & Technol, Xian 710049, Shaanxi, Peoples R China
[2] Chinese Univ Hong Kong, Dept Comp Sci & Engn, Shatin, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Software aging; performance degradation; multi-scale entropy; failure prediction; availability; VARIABLE SELECTION; SOFTWARE; PERFORMANCE; DEGRADATION
DOI
10.1109/TDSC.2016.2604381
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Even well-designed software systems suffer from chronic performance degradation, also known as "software aging", due to internal (e.g., software bugs) or external (e.g., resource exhaustion) impairments. These chronic problems often fly under the radar of software monitoring systems before causing severe impacts (e.g., system failures). Predicting the failures caused by these problems in a timely manner is therefore a challenging issue. Unfortunately, the effectiveness of prior approaches is far from satisfactory because the aging indicators they adopt are insufficient. To accurately predict failures caused by software aging, termed Aging-Related Failures (ARFs), this paper presents a novel entropy-based aging indicator, Multidimensional Multi-scale Entropy (MMSE), which leverages the complexity embedded in runtime performance metrics to indicate software aging. To the best of our knowledge, this is the first work to leverage entropy to predict ARFs. Based upon MMSE, we implement three failure prediction approaches encapsulated in a proof-of-concept prototype named ARF-Predictor. Experimental evaluations in a Video on Demand (VoD) system and in a real-world production system, AntVision, show that ARF-Predictor can predict ARFs with very high accuracy and a low Ahead-Time-To-Failure (ATTF). Compared to previous approaches, ARF-Predictor improves the prediction accuracy by about 5 times and reduces ATTF by as much as 3 orders of magnitude. In addition, ARF-Predictor is lightweight enough to satisfy the real-time requirement.
Pages: 675-693
Page count: 19