PolarFS: An Ultra-low Latency and Failure Resilient Distributed File System for Shared Storage Cloud Database

Cited by: 78
Authors
Cao, Wei
Liu, Zhenjun
Wang, Peng
Chen, Sen
Zhu, Caifeng
Zheng, Song
Wang, Yuhui
Ma, Guoqing
Source
Proceedings of the VLDB Endowment | 2018 / Vol. 11 / No. 12
DOI
10.14778/3229863.3229872
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
PolarFS is a distributed file system with ultra-low latency and high availability, designed for the POLARDB database service, which is now available on Alibaba Cloud. PolarFS uses a lightweight network stack and I/O stack in user space, taking full advantage of emerging technologies such as RDMA, NVMe, and SPDK. As a result, the end-to-end latency of PolarFS is reduced drastically, and our experiments show that its write latency is quite close to that of a local file system on SSD. To keep replicas consistent while maximizing I/O throughput, we develop ParallelRaft, a consensus protocol derived from Raft that breaks Raft's strict serialization by exploiting databases' tolerance of out-of-order I/O completion. ParallelRaft inherits Raft's understandability and ease of implementation while providing much better I/O scalability for PolarFS. We also describe the shared storage architecture of PolarFS, which provides strong support for POLARDB.
Pages: 1849-1862
Page count: 14