High-availability clusters: A taxonomy, survey, and future directions

Cited by: 8
Authors
Somasekaram, Premathas [1]
Calinescu, Radu [1]
Buyya, Rajkumar [2]
Affiliations
[1] Univ York, Dept Comp Sci, Deramore Lane, York YO10 5GH, N Yorkshire, England
[2] Univ Melbourne, Sch Comp & Informat Syst, Cloud Comp & Distributed Syst CLOUDS Lab, Melbourne, Vic, Australia
Keywords
Clustering; Dependability; Enterprise system; High availability; High availability clusters; Reliability; CLOUD; REPLICATION; ARCHITECTURE; SYSTEMS;
DOI
10.1016/j.jss.2021.111208
Chinese Library Classification (CLC) number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
The delivery of key services in domains ranging from finance and manufacturing to healthcare and transportation is underpinned by a rapidly growing number of mission-critical enterprise applications. Ensuring the continuity of these complex applications requires the use of software-managed infrastructures called high-availability clusters (HACs). HACs employ sophisticated techniques to monitor the health of key enterprise application layers and of the resources they use, and to seamlessly restart or relocate application components after failures. In this paper, we first describe the manifold uses of HACs to protect essential layers of a critical application and present the architecture of high-availability clusters. We then propose a taxonomy that covers all key aspects of HACs (deployment patterns, application areas, types of cluster, topology, cluster management, failure detection and recovery, consistency and integrity, and data synchronisation), and we use this taxonomy to provide a comprehensive survey of the end-to-end software solutions available for the HAC deployment of enterprise applications. Finally, we discuss the limitations and challenges of existing HAC solutions, and we identify opportunities for future research in the area. © 2021 Elsevier Inc. All rights reserved.
Pages: 32
Related papers
172 items in total
[21]   Total order broadcast and multicast algorithms: Taxonomy and survey [J].
Défago, X ;
Schiper, A ;
Urbán, P .
ACM COMPUTING SURVEYS, 2004, 36 (04) :372-421
[22]   Electron: Towards Efficient Resource Management on Heterogeneous Clusters with Apache Mesos [J].
DelValle, Renan ;
Kaushik, Pradyumna ;
Jain, Abhishek ;
Hartog, Jessica ;
Govindaraju, Madhusudhan .
2017 IEEE 10TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING (CLOUD), 2017, :262-269
[23]  
Demchenko Y, 2014, PROCEEDINGS OF THE 2014 INTERNATIONAL CONFERENCE ON COLLABORATION TECHNOLOGIES AND SYSTEMS (CTS), P104, DOI 10.1109/CTS.2014.6867550
[24]  
DH2i, 2020, DXENTERPRISE
[25]   Availability Assessment of HA Standby Redundant Clusters [J].
Distefano, Salvatore ;
Longo, Francesco ;
Scarpa, Marco .
2010 29TH IEEE INTERNATIONAL SYMPOSIUM ON RELIABLE DISTRIBUTED SYSTEMS SRDS 2010, 2010, :265-274
[26]   The Transis approach to high availability cluster communication [J].
Dolev, D ;
Malki, D .
COMMUNICATIONS OF THE ACM, 1996, 39 (04) :64-70
[27]   Towards a unified taxonomy and architecture of cloud frameworks [J].
Dukaric, Robert ;
Juric, Matjaz B. .
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2013, 29 (05) :1196-1210
[28]   High availability in clouds: systematic review and research challenges [J].
Endo, Patricia T. ;
Rodrigues, Moises ;
Goncalves, Glauco E. ;
Kelner, Judith ;
Sadok, Djamel H. ;
Curescu, Calin .
JOURNAL OF CLOUD COMPUTING-ADVANCES SYSTEMS AND APPLICATIONS, 2016, 5
[29]  
Engelmann C, 2008, IEEE ACM INT SYMP, P813, DOI 10.1109/CCGRID.2008.78
[30]  
Engelmann C, 2008, THESIS U READING UK