High-availability clusters: A taxonomy, survey, and future directions

Cited by: 8
Authors
Somasekaram, Premathas [1]
Calinescu, Radu [1]
Buyya, Rajkumar [2]
Affiliations
[1] Univ York, Dept Comp Sci, Deramore Lane, York YO10 5GH, N Yorkshire, England
[2] Univ Melbourne, Sch Comp & Informat Syst, Cloud Comp & Distributed Syst CLOUDS Lab, Melbourne, Vic, Australia
Keywords
Clustering; Dependability; Enterprise system; High availability; High availability clusters; Reliability; CLOUD; REPLICATION; ARCHITECTURE; SYSTEMS;
DOI
10.1016/j.jss.2021.111208
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202; 0835
Abstract
The delivery of key services in domains ranging from finance and manufacturing to healthcare and transportation is underpinned by a rapidly growing number of mission-critical enterprise applications. Ensuring the continuity of these complex applications requires the use of software-managed infrastructures called high-availability clusters (HACs). HACs employ sophisticated techniques to monitor the health of key enterprise application layers and of the resources they use, and to seamlessly restart or relocate application components after failures. In this paper, we first describe the manifold uses of HACs to protect essential layers of a critical application and present the architecture of high-availability clusters. We then propose a taxonomy that covers all key aspects of HACs: deployment patterns, application areas, types of cluster, topology, cluster management, failure detection and recovery, consistency and integrity, and data synchronisation. We use this taxonomy to provide a comprehensive survey of the end-to-end software solutions available for the HAC deployment of enterprise applications. Finally, we discuss the limitations and challenges of existing HAC solutions, and we identify opportunities for future research in the area. © 2021 Elsevier Inc. All rights reserved.
Pages: 32