R3: Record-Replay-Retroaction for Database-Backed Applications

被引:1
|
作者
Li, Qian [1 ]
Kraft, Peter [1 ]
Cafarella, Michael [2 ]
Demiralp, Cagatay [2 ]
Graefe, Goetz [3 ]
Kozyrakis, Christos [1 ]
Stonebraker, Michael [2 ]
Suresh, Lalith [4 ]
Yu, Xiangyao [5 ]
Zaharia, Matei [6 ]
机构
[1] Stanford Univ, Stanford, CA 94305 USA
[2] MIT, CSAIL, Cambridge, MA USA
[3] Google, New York, NY USA
[4] Feldera, Foster City, CA USA
[5] UW Madison, Madison, WI USA
[6] Univ Calif Berkeley, Berkeley, CA USA
来源
PROCEEDINGS OF THE VLDB ENDOWMENT | 2023年 / 16卷 / 11期
关键词
TRANSACTIONS;
D O I
10.14778/3611479.3611510
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Developers would benefit greatly from time travel: being able to faithfully replay past executions and retroactively execute modified code on past events. Currently, replay and retroaction are impractical because they require expensively capturing fine-grained timing information to reproduce concurrent accesses to shared state. In this paper, we propose practical time travel for database-backed applications, an important class of distributed applications that access shared state through transactions. We present R-3, a novel Record-Replay-Retroaction tool. R-3 implements a lightweight interceptor to record concurrency information for applications at transaction-level granularity, enabling replay and retroaction with minimal overhead. We address key challenges in both replay and retroaction. First, we design a novel algorithm for faithfully reproducing application requests running with snapshot isolation, allowing R-3 to support most production DBMSs. Second, we develop a retroactive execution mechanism that provides high fidelity with the original trace while supporting nearly arbitrary code modifications. We demonstrate how R-3 simplifies debugging for real, hard-to-reproduce concurrency bugs from popular open-source web applications. We evaluate R-3 using TPC-C and microservice workloads and show that R-3 always-on recording has a small performance overhead (<25% for point queries but <0.1% for complex transactions like in TPC-C) during normal application execution and that R-3 can retroactively execute bugfixed code over recorded traces within 0.11 - 0.78x of the original execution time.
引用
收藏
页码:3085 / 3097
页数:13
相关论文
empty
未找到相关数据