Adaptive checkpointing with reliable storage in cloud environment

Cited by: 1
Authors
Meroufel B. [1]
Belalem G. [1]
Affiliation
[1] Department of Computer Science, Faculty of Exact and Applied Sciences, University of Oran - Es Senia, Oran
Keywords
availability; checkpointing; cloud; consistency; coordination; fault tolerance; reliability; replication; storage
DOI
10.3233/MGS-170270
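Chinese Library Classification (CLC) number
TP33 [Electronic digital computers (discrete-action electronic computers)]
Discipline classification code
081201
Abstract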
Cloud computing has recently gained popularity as a platform for on-demand, highly available, and highly scalable access to resources, offering dynamic, flexible infrastructures and services with guaranteed QoS (Quality of Service). In this environment, reliability is essential to keep the system effective and to meet the desired SLA (Service Level Agreement). The work proposed in this paper checkpoints tasks to ensure reliability and replicates checkpoint files to ensure accessibility. For service reliability, we use a fault tolerance strategy based on two-level adaptive checkpointing; this strategy takes into account the complexity and characteristics of cloud computing and minimizes overhead. For the accessibility and availability of checkpoint storage, we use dynamic passive replication in which an availability degree specified by SLA criteria determines the number and placement of replicas. © 2017 - IOS Press and the authors. All rights reserved.
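The abstract does not state how the availability degree is turned into a replica count. As a rough illustration only, and not the authors' method, the sketch below assumes that every storage node has the same availability and that nodes fail independently, so that n replicas are jointly available with probability 1 - (1 - a)^n. The function name `replicas_needed` and the `max_replicas` cap are hypothetical.

```python
def replicas_needed(node_availability: float,
                    target_availability: float,
                    max_replicas: int = 64) -> int:
    """Smallest replica count whose combined availability meets the target.

    Assumption (not from the paper): nodes fail independently and share
    the same availability, so n replicas are all down with probability
    (1 - node_availability) ** n.
    """
    if not 0.0 < node_availability < 1.0:
        raise ValueError("node_availability must be strictly between 0 and 1")
    if not 0.0 < target_availability < 1.0:
        raise ValueError("target_availability must be strictly between 0 and 1")

    p_node_down = 1.0 - node_availability
    p_all_down = 1.0
    for n in range(1, max_replicas + 1):
        p_all_down *= p_node_down          # probability that all n replicas are down
        if 1.0 - p_all_down >= target_availability:
            return n
    raise ValueError("target not reachable within max_replicas")


if __name__ == "__main__":
    # Hypothetical SLA: checkpoint storage available 99.5% of the time,
    # on nodes that are each available 90% of the time.
    print(replicas_needed(0.9, 0.995))
```

A placement policy would then choose which nodes hold those replicas; that step depends on details the abstract does not provide and is left out here.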
Pages: 253-268
Page count: 15