ZygOS: Achieving Low Tail Latency for Microsecond-scale Networked Tasks

Cited by: 138
Authors
Prekas, George [1]
Kogias, Marios [1]
Bugnion, Edouard [1]
Affiliations
[1] Ecole Polytech Fed Lausanne, Lausanne, Switzerland
Source
PROCEEDINGS OF THE TWENTY-SIXTH ACM SYMPOSIUM ON OPERATING SYSTEMS PRINCIPLES (SOSP '17) | 2017
Keywords
Tail latency; Microsecond-scale computing;
DOI
10.1145/3132747.3132780
Chinese Library Classification (CLC) number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
This paper focuses on the efficient scheduling on multicore systems of very fine-grain networked tasks, which are the typical building block of online data-intensive applications. The explicit goal is to deliver high throughput (millions of remote procedure calls per second) for tail latency service-level objectives that are a small multiple of the task size. We present ZygOS, a system optimized for μs-scale, in-memory computing on multicore servers. It implements a work-conserving scheduler within a specialized operating system designed for high request rates and a large number of network connections. ZygOS uses a combination of shared-memory data structures, multi-queue NICs, and inter-processor interrupts to rebalance work across cores. For an aggressive service-level objective expressed at the 99th percentile, ZygOS achieves 75% of the maximum possible load determined by a theoretical, zero-overhead model (centralized queueing with FCFS) for 10 μs tasks, and 88% for 25 μs tasks. We evaluate ZygOS with a networked version of Silo, a state-of-the-art in-memory transactional database, running TPC-C. For a service-level objective of 1000 μs latency at the 99th percentile, ZygOS can deliver a 1.63x speedup over Linux (because of its dataplane architecture) and a 1.26x speedup over IX, a state-of-the-art dataplane (because of its work-conserving scheduler).
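The abstract measures ZygOS against a zero-overhead model with one centralized FCFS queue. The sketch below is not taken from the paper or from the ZygOS code base. It is a minimal queueing simulation, assuming Poisson arrivals and exponentially distributed service times (both assumptions of this example), that shows why a single shared queue yields a lower 99th-percentile latency than statically partitioned per-core queues at the same load. The function name `simulate_p99` and all parameter values are hypothetical.

```python
import heapq
import random


def simulate_p99(num_cores, load, num_requests, centralized, seed=0):
    """Return the 99th-percentile latency, in units of the mean service time.

    Assumptions of this sketch (not from the paper): Poisson arrivals,
    exponential service times with mean 1, and zero scheduling overhead.
    """
    rng = random.Random(seed)
    arrival_rate = load * num_cores  # total request rate across all cores
    free_at = [0.0] * num_cores      # time at which each core becomes idle
    now = 0.0
    latencies = []

    for _ in range(num_requests):
        now += rng.expovariate(arrival_rate)  # next arrival time
        service = rng.expovariate(1.0)        # this request's service time

        if centralized:
            # Single shared FCFS queue: the request runs on whichever core
            # frees up first, so no core idles while work is waiting.
            earliest = heapq.heappop(free_at)
            finish = max(now, earliest) + service
            heapq.heappush(free_at, finish)
        else:
            # Partitioned queues: the request is pinned to a random core and
            # waits there even if other cores are idle.
            core = rng.randrange(num_cores)
            finish = max(now, free_at[core]) + service
            free_at[core] = finish

        latencies.append(finish - now)

    latencies.sort()
    return latencies[int(0.99 * len(latencies))]


if __name__ == "__main__":
    for mode, flag in (("centralized FCFS", True), ("partitioned FCFS", False)):
        p99 = simulate_p99(num_cores=16, load=0.7, num_requests=200_000,
                           centralized=flag)
        print(f"{mode}: p99 = {p99:.2f} x mean service time")
```

The list of zeros is already a valid min-heap, so the centralized branch can use `heapq` on it directly. The gap between the two printed values gives a rough sense of the headroom that a work-conserving scheduler can recover.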
Pages: 325-341
Page count: 17
Related papers
5 items in total
  • [1] Achieving Microsecond-Scale Tail Latency Efficiently with Approximate Optimal Scheduling
    Iyer, Rishabh
    Unal, Musa
    Kogias, Marios
    Candea, George
    PROCEEDINGS OF THE TWENTY-NINTH ACM SYMPOSIUM ON OPERATING SYSTEMS PRINCIPLES, SOSP 2023, 2023: 466-481
  • [2] Efficient Scheduling Policies for Microsecond-Scale Tasks
    McClure, Sarah
    Ousterhout, Amy
    Shenker, Scott
    Ratnasamy, Sylvia
    PROCEEDINGS OF THE 19TH USENIX SYMPOSIUM ON NETWORKED SYSTEMS DESIGN AND IMPLEMENTATION (NSDI '22), 2022: 1-18
  • [3] Nu: Achieving Microsecond-Scale Resource Fungibility with Logical Processes
    Ruan, Zhenyuan
    Park, Seo Jin
    Aguilera, Marcos K.
    Belay, Adam
    Schwarzkopf, Malte
    PROCEEDINGS OF THE 20TH USENIX SYMPOSIUM ON NETWORKED SYSTEMS DESIGN AND IMPLEMENTATION, NSDI 2023, 2023: 1409-1427
  • [4] Achieving Low Tail-latency and High Scalability for Serializable Transactions in Edge Computing
    Chen, Xusheng
    Song, Haoze
    Jiang, Jianyu
    Ruan, Chaoyi
    Li, Cheng
    Wang, Sen
    Zhang, Gong
    Cheng, Reynold
    Cui, Heming
    PROCEEDINGS OF THE SIXTEENTH EUROPEAN CONFERENCE ON COMPUTER SYSTEMS (EUROSYS '21), 2021: 210-227
  • [5] Ultrareliable and Low-Latency Wireless Communication: Tail, Risk, and Scale
    Bennis, Mehdi
    Debbah, Merouane
    Poor, H. Vincent
    PROCEEDINGS OF THE IEEE, 2018, 106(10): 1834-1853