A user-level checkpointing library for POSIX threads programs

Cited by: 11
Authors
Dieter, WR [1]
Lumpp, JE [1]
Affiliation
[1] Univ Kentucky, Dept Elect Engn, Lexington, KY 40506 USA
Source
TWENTY-NINTH ANNUAL INTERNATIONAL SYMPOSIUM ON FAULT-TOLERANT COMPUTING, DIGEST OF PAPERS | 1999
DOI
10.1109/FTCS.1999.781054
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Several user-level checkpointing libraries that checkpoint Unix processes have been developed. However, they do not support multithreaded programs. This paper describes a user-level checkpointing library to checkpoint multithreaded programs that use the POSIX threads library provided by Solaris 2. Experiments with programs from the SPLASH-2 benchmark suite showed a 3% to 10% increase in execution time with checkpointing enabled, plus an additional overhead for saving the program's state. The checkpointing library described here is available at http://www.dcs.uky.edu/~chkpt/.
Pages: 224-227
Page count: 4