Speculative execution in a distributed file system

Cited by: 22
Authors
Nightingale, Edmund B. [1 ]
Chen, Peter M. [1 ]
Flinn, Jason [1 ]
Affiliations
[1] Univ Michigan, Dept Elect Engn & Comp Sci, Ann Arbor, MI 48109 USA
Source
ACM TRANSACTIONS ON COMPUTER SYSTEMS | 2006 / Vol. 24 / No. 04
Keywords
performance; design; distributed file systems; speculative execution; causality
DOI
10.1145/1189256.1189258
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202 ;
Abstract
Speculator provides Linux kernel support for speculative execution. It allows multiple processes to share speculative state by tracking causal dependencies propagated through interprocess communication. It guarantees correct execution by preventing speculative processes from externalizing output, for example, sending a network message or writing to the screen, until the speculations on which that output depends have proven to be correct. Speculator improves the performance of distributed file systems by masking I/O latency and increasing I/O throughput. Rather than block during a remote operation, a file system predicts the operation's result, then uses Speculator to checkpoint the state of the calling process and speculatively continue its execution based on the predicted result. If the prediction is correct, the checkpoint is discarded; if it is incorrect, the calling process is restored to the checkpoint, and the operation is retried. We have modified the client, server, and network protocol of two distributed file systems to use Speculator. For PostMark and Andrew-style benchmarks, speculative execution results in a factor of 2 performance improvement for NFS over local area networks and an order of magnitude improvement over wide area networks. For the same benchmarks, Speculator enables the Blue File System to provide the consistency of single-copy file semantics and the safety of synchronous I/O, yet still outperform current distributed file systems with weaker consistency and safety.
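The checkpoint, predict, and commit-or-rollback cycle described in the abstract can be illustrated with a small user-space sketch. This is not Speculator's kernel implementation: the class `SpeculativeProcess`, the function `speculative_call`, and the dictionary-based process state are all hypothetical names chosen for illustration, and the remote operation runs sequentially here rather than overlapping with the speculative work. On a misprediction, the sketch simply re-runs the continuation with the actual result, which is a simplification of retrying the operation.

```python
import copy


class SpeculativeProcess:
    """Toy model of a process that can run ahead on a predicted result."""

    def __init__(self, state):
        self.state = state          # mutable process state (a dict here)
        self.checkpoint = None      # saved copy of the state while speculating
        self.pending_output = []    # output held back until the speculation resolves
        self.externalized = []      # output actually released to the outside world

    def output(self, message):
        # While speculative, output is buffered instead of externalized.
        if self.checkpoint is not None:
            self.pending_output.append(message)
        else:
            self.externalized.append(message)

    def begin_speculation(self):
        # Checkpoint the state before continuing on a prediction.
        self.checkpoint = copy.deepcopy(self.state)

    def commit(self):
        # Prediction was correct: discard the checkpoint, release buffered output.
        self.checkpoint = None
        self.externalized.extend(self.pending_output)
        self.pending_output.clear()

    def rollback(self):
        # Prediction was wrong: restore the checkpoint, drop buffered output.
        self.state = self.checkpoint
        self.checkpoint = None
        self.pending_output.clear()


def speculative_call(proc, predicted, remote_op, continuation):
    """Run `continuation` on a predicted result, then validate the prediction."""
    proc.begin_speculation()
    continuation(proc, predicted)   # run ahead using the predicted result
    actual = remote_op()            # assumption: sequential here, concurrent in a real system
    if actual == predicted:
        proc.commit()
    else:
        proc.rollback()
        continuation(proc, actual)  # re-execute non-speculatively with the real result
    return actual
```

In this sketch no buffered message reaches `externalized` unless the prediction it depended on turned out to be correct, which mirrors the abstract's guarantee that speculative output is not externalized early.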
Pages: 361-392
Page count: 32