Fault tolerance in cloud computing environment: A systematic survey

Cited by: 58
Authors
Hasan, Moin [1]
Goraya, Singh [1]
Affiliations
[1] St Longowal Inst Engn & Technol, Dept Comp Sci & Engn, Longowal, Punjab, India
Keywords
Cloud computing; Faults and failures; Fault tolerance; Survey; DATA-STORAGE; RELIABILITY; MANAGEMENT; ALGORITHM; MIGRATION; SCHEME;
DOI
10.1016/j.compind.2018.03.027
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Fault tolerance is among the most important issues in delivering reliable cloud services. It is difficult to implement because of the dynamic service infrastructure, complex configurations, and numerous interdependencies in the cloud. Extensive research continues to address fault tolerance in the cloud. Implementing a fault tolerance policy in the cloud requires not only specific knowledge of the application domain but also a comprehensive analysis of the background and the prevalent techniques. Some recent surveys attempt to consolidate the fault tolerance architectures and approaches proposed for cloud environments, but they appear limited in some respects. This paper gives a systematic and comprehensive account of the different fault types, their causes, and the fault tolerance approaches used in the cloud. It presents a broad survey of fault tolerance frameworks in terms of their basic approaches, fault applicability, and other key features, and includes a comparative analysis of the surveyed frameworks. For the first time, a quantified view of the applicability of these frameworks is presented, based on an analysis of the frameworks cited in this paper as well as those covered in recently published major surveys. The analysis shows that checkpoint-restart and replication-oriented techniques are primarily used to target crash faults in the cloud.
Pages: 156-172
Number of pages: 17
Related papers
50 in total
  • [1] A survey of fault tolerance in cloud computing
    Kumari, Priti
    Kaur, Parmeet
    JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2021, 33 (10) : 1159 - 1176
  • [2] Fault Tolerance in Cloud Computing - Survey
    Ataallah, Salma M. A.
    Nassar, Salwa M.
    Hemayed, Elsayed E.
    2015 11TH INTERNATIONAL COMPUTER ENGINEERING CONFERENCE (ICENCO), 2015, : 241 - 245
  • [3] A survey of fault tolerance architecture in cloud computing
    Cheraghlou, Mehdi Nazari
    Khadem-Zadeh, Ahmad
    Haghparast, Majid
    JOURNAL OF NETWORK AND COMPUTER APPLICATIONS, 2016, 61 : 81 - 92
  • [4] Towards Resilient Method: An exhaustive survey of fault tolerance methods in the cloud computing environment
    Shahid, Muhammad Asim
    Islam, Noman
    Alam, Muhammad Mansoor
    Mazliham, M. S.
    Musa, Shahrulniza
    COMPUTER SCIENCE REVIEW, 2021, 40
  • [5] Failover strategy for fault tolerance in cloud computing environment
    Mohammed, Bashir
    Kiran, Mariam
    Maiyama, Kabiru M.
    Kamala, Mumtaz M.
    Awan, Irfan-Ullah
    SOFTWARE-PRACTICE & EXPERIENCE, 2017, 47 (09) : 1243 - 1274
  • [6] A Comprehensive Survey of Fault Tolerance Techniques in Cloud Computing
    Agarwal, Himanshu
    Sharma, Anju
    2015 INTERNATIONAL CONFERENCE ON COMPUTING AND NETWORK COMMUNICATIONS (COCONET), 2015, : 408 - 413
  • [7] Energy efficient fault tolerance techniques in green cloud computing: A systematic survey and taxonomy
    Bharany, Salil
    Badotra, Sumit
    Sharma, Sandeep
    Rani, Shalli
    Alazab, Mamoun
    Jhaveri, Rutvij H.
    Gadekallu, Thippa Reddy
    SUSTAINABLE ENERGY TECHNOLOGIES AND ASSESSMENTS, 2022, 53
  • [8] Fault tolerance for a scientific workflow system in a Cloud computing environment
    Khaldi M.
    Rebbah M.
    Meftah B.
    Smail O.
    International Journal of Computers and Applications, 2020, 42 (07) : 705 - 714
  • [9] An Integrated Virtualized Strategy for Fault Tolerance in Cloud Computing Environment
    Mohammed, Bashir
    Kiran, Mariam
    Awan, Irfan-Ullah
    Maiyama, Kabiru M.
    2016 INT IEEE CONFERENCES ON UBIQUITOUS INTELLIGENCE & COMPUTING, ADVANCED & TRUSTED COMPUTING, SCALABLE COMPUTING AND COMMUNICATIONS, CLOUD AND BIG DATA COMPUTING, INTERNET OF PEOPLE, AND SMART WORLD CONGRESS (UIC/ATC/SCALCOM/CBDCOM/IOP/SMARTWORLD), 2016, : 542 - 549
  • [10] Survey on Fault-Tolerance-Aware Scheduling in Cloud Computing
    Kathpal, Chesta
    Garg, Ritu
    INFORMATION AND COMMUNICATION TECHNOLOGY FOR COMPETITIVE STRATEGIES, 2019, 40 : 275 - 283