DORADD: Deterministic Parallel Execution in the Era of Microsecond-Scale Computing

Cited by: 0
Authors
Liu, Zhengqing [1]
Unal, Musa [2]
Parkinson, Matthew J. [3]
Kogias, Marios [1]
Affiliations
[1] Imperial Coll London, London, England
[2] Ecole Polytech Fed Lausanne, Lausanne, Switzerland
[3] Azure Res, Austin, TX USA
Source
Proceedings of the 30th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming (PPoPP 2025), 2025
Keywords
parallel execution; determinism; runtime scheduling; transactions
DOI
10.1145/3710848.3710872
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Deterministic parallelism is a key building block for distributed and fault-tolerant systems that offers substantial performance benefits while guaranteeing determinism. By studying existing deterministically parallel systems (DPS), we identify certain design pitfalls, such as batched execution and inefficient runtime synchronization, that preclude them from meeting the demands of μs-scale and high-throughput distributed systems deployed in modern datacenters. We present DORADD, a deterministically parallel runtime with low latency and high throughput, designed for modern datacenter services. DORADD introduces a hybrid scheduling scheme that effectively decouples request dispatching from execution. It employs a single dispatcher to deterministically construct a dynamic dependency graph of incoming requests and worker pools that can independently execute requests in a work-conserving and synchronization-free manner. Furthermore, DORADD overcomes the single-dispatcher throughput bottleneck using core pipelining. We use DORADD to build an in-memory database and compare it with Caracal, the current state-of-the-art deterministic database, via the YCSB and TPC-C benchmarks. Our evaluation shows up to 2.5x better throughput and more than 150x and 300x better tail latency in non-contended and contended cases, respectively. We also compare DORADD with Caladan, the state-of-the-art non-deterministic remote procedure call (RPC) scheduler, and demonstrate that determinism in DORADD does not incur any performance overhead.
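
The dispatcher-and-workers idea in the abstract can be illustrated with a toy sketch. This is not DORADD's implementation: the names `Request`, `dispatch`, `run`, and `example_requests` are hypothetical, every key access is treated as a write, and the whole request list is dispatched before execution starts (a simplification; the abstract describes dispatching decoupled from execution and argues against batching). The sketch only shows the ordering rule: a single dispatcher links each request to the previous request that touched the same key, and workers run any request whose predecessors have finished, so the result matches serial execution in arrival order.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class Request:
    """A request with the keys it touches and the work to run (hypothetical type)."""

    def __init__(self, rid, keys, fn):
        self.rid = rid
        self.keys = keys          # tuple of keys; all accesses treated as writes
        self.fn = fn              # callable taking the shared store
        self.pending = 0          # number of unfinished predecessors
        self.successors = []      # requests that must wait for this one


def dispatch(requests):
    """Single-threaded dispatcher: build the dependency graph in arrival order."""
    last = {}                     # key -> most recent request touching it
    for req in requests:
        preds = {last[k].rid: last[k] for k in req.keys if k in last}
        for pred in preds.values():
            pred.successors.append(req)
        req.pending = len(preds)
        for k in req.keys:
            last[k] = req
    return [r for r in requests if r.pending == 0]


def run(requests, store, workers=4):
    """Execute requests on a worker pool, respecting the dependency graph."""
    remaining = len(requests)
    if remaining == 0:
        return store
    ready = dispatch(requests)
    lock = threading.Lock()       # protects pending counters and `remaining`
    done = threading.Event()

    with ThreadPoolExecutor(max_workers=workers) as pool:
        def execute(req):
            nonlocal remaining
            req.fn(store)         # note: an exception here would stall this sketch
            newly_ready = []
            with lock:
                for succ in req.successors:
                    succ.pending -= 1
                    if succ.pending == 0:
                        newly_ready.append(succ)
                remaining -= 1
                finished = remaining == 0
            for succ in newly_ready:
                pool.submit(execute, succ)
            if finished:
                done.set()

        for req in ready:
            pool.submit(execute, req)
        done.wait()
    return store


def example_requests():
    """Four requests with non-commutative updates on keys "a" and "b"."""
    return [
        Request(0, ("a",), lambda s: s.__setitem__("a", s["a"] + 1)),
        Request(1, ("b",), lambda s: s.__setitem__("b", s["b"] * 3)),
        Request(2, ("a", "b"), lambda s: s.__setitem__("a", s["a"] * s["b"])),
        Request(3, ("b",), lambda s: s.__setitem__("b", s["b"] - 4)),
    ]


if __name__ == "__main__":
    print(run(example_requests(), {"a": 1, "b": 2}))
```

In this example, requests 0 and 1 touch disjoint keys and may run in parallel, while request 2 waits for both and request 3 waits for request 2, so repeated runs produce the same final store regardless of thread scheduling.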
Pages: 282-296
Page count: 15