A Method of Self-adaptive Pre-copy Container Checkpoint

Cited by: 1
Authors
Chen, Xiao [1]
Jiang, Jian-Hui [1]
Jiang, Qu [1]
Affiliations
[1] Tongji Univ, Sch Software Engn, Shanghai 201804, Peoples R China
Source
2015 IEEE 21ST PACIFIC RIM INTERNATIONAL SYMPOSIUM ON DEPENDABLE COMPUTING (PRDC) | 2015
Keywords
Cloud computing; Container checkpoint; Self-adaptive; Pre-copy; Downtime
DOI
10.1109/PRDC.2015.11
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Container checkpointing is a backward-recovery fault-tolerance technique through which high availability of containers can be achieved. Checkpoint downtime is the key performance indicator of a container checkpoint system. A long checkpoint downtime causes a user-perceptible interruption of the services deployed in the container's guest operating system, which is difficult to accept for cloud systems offering key services. To reduce the downtime of container checkpointing, this paper proposes a self-adaptive pre-copy container checkpoint method. Over several rounds of pre-copy, memory pages of the container that are not frequently modified are copied in advance. In every round of pre-copy, only the dirty pages generated in the previous round are saved while the container is frozen, which reduces the checkpoint downtime. The number of pre-copy rounds is determined adaptively from the workload of the guest operating system in the container. A prototype, Self-Adaptive Pre-copy Diskless Linux Container Checkpoint (SAPCDLCKPT), is implemented on top of Linux Containers (LXC). Experimental results show that, compared with existing methods, SAPCDLCKPT achieves lower checkpoint downtime as the container's memory configuration increases and across different kinds of workloads. The largest reduction in checkpoint downtime is 78.24%.
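The pre-copy idea summarized in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the SAPCDLCKPT implementation: the function name `precopy_checkpoint`, the parameters `dirty_rate`, `max_rounds`, and `stop_threshold`, the uniform random write pattern, and the stopping rule are all hypothetical, since the abstract does not specify how the number of rounds is chosen. The sketch also assumes the container is frozen only for the final save.

```python
import random


def precopy_checkpoint(num_pages, dirty_rate, max_rounds=10, stop_threshold=50, seed=0):
    """Simulate pre-copy rounds.

    Returns (pages copied in each pre-copy round, pages left to save while frozen).
    All parameters are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    to_copy = set(range(num_pages))  # the first round copies every page
    copied_per_round = []
    for _ in range(max_rounds):
        copied = len(to_copy)
        copied_per_round.append(copied)
        # Assumption: the workload writes to random pages while the round runs,
        # and the number of writes is proportional to the pages copied.
        writes = int(dirty_rate * copied)
        to_copy = {rng.randrange(num_pages) for _ in range(writes)}
        # Hypothetical adaptive rule: stop once the dirty set is small enough,
        # or once another round would no longer shrink it.
        if len(to_copy) <= stop_threshold or len(to_copy) >= copied:
            break
    # The remaining dirty pages are what must be saved during the freeze.
    return copied_per_round, len(to_copy)


if __name__ == "__main__":
    rounds, frozen = precopy_checkpoint(num_pages=10_000, dirty_rate=0.1)
    print("pages copied per pre-copy round:", rounds)
    print("pages saved while frozen:", frozen)
```

With a light write rate the dirty set shrinks quickly, so only a small fraction of the pages is left for the freeze. A write-heavy workload keeps the dirty set large, which is the kind of situation in which a workload-dependent choice of round count matters.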
Pages: 290-300
Page count: 11