StageFS: A Parallel File System Optimizing Metadata Performance for SSD Based Clusters

Authors
Wu, Huijun [1]
Zhu, Liming [1]
Wu, Dongyao [1]
Lu, Kai [2]
Li, Gen [2]
Affiliations
[1] Univ New South Wales, Data61, CSIRO, Kensington, NSW, Australia
[2] Natl Univ Def Technol, Changsha, Hunan, Peoples R China
Source
2016 IEEE TRUSTCOM/BIGDATASE/ISPA | 2016
Funding
National Science Foundation (USA);
Keywords
parallel file system; metadata; LSM-tree; small file;
DOI
10.1109/TrustCom.2016.328
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Parallel file systems are important infrastructure for both cloud and high-performance computing. The performance of metadata operations is critical to achieving high scalability in parallel file systems. Nevertheless, traditional parallel file systems lack a scalable metadata service. To alleviate this problem, some previous research distributes metadata across separate large-scale clusters and uses write-optimized techniques such as the log-structured merge tree (LSM-tree) to store metadata. However, the LSM-tree design does not consider the features of solid-state drives (SSDs), which are widely deployed in modern parallel computing systems, so storing metadata in LSM-trees has not yet exploited the potential benefits of SSDs. In this paper, we present StageFS, a parallel file system optimized for SSD-based clusters. StageFS stores both metadata and small files in LSM-trees for fast indexing. For larger files, the file blocks are stored separately to reduce write amplification. In addition, the parallel I/O capability of SSDs is used to improve the performance of accessing directories and large files. To avoid frequent small writes, StageFS uses buffering to better utilize SSD bandwidth. Experimental results show that StageFS provides better performance than Ceph and HDFS in metadata operations (up to 21.28x) and small file access (1.92x to two orders of magnitude).
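The storage layout described in the abstract can be illustrated with a toy model. The sketch below is not StageFS code: the class name `ToyStagedStore`, the thresholds, and the use of plain dictionaries in place of an LSM-tree and an SSD block store are all assumptions made for illustration. It shows only the three ideas the abstract names: small files stored inline with their index entry, large files split into separately stored blocks, and a buffer that batches small index writes.

```python
from dataclasses import dataclass, field


@dataclass
class ToyStagedStore:
    """Illustrative only; every name and default here is hypothetical."""
    small_limit: int = 4096        # files up to this size are stored inline
    block_size: int = 8192         # large files are cut into blocks of this size
    flush_threshold: int = 16384   # buffered bytes that trigger a flush
    index: dict = field(default_factory=dict)    # stands in for the LSM-tree
    blocks: dict = field(default_factory=dict)   # stands in for separate block storage
    buffer: list = field(default_factory=list)   # pending index writes
    buffered_bytes: int = 0
    flushes: int = 0

    def write(self, path: str, data: bytes) -> None:
        if len(data) <= self.small_limit:
            # Small file: keep the contents next to the index entry.
            self._stage(path, ("inline", data), len(data))
        else:
            # Large file: store blocks separately, index only the block ids.
            ids = []
            for offset in range(0, len(data), self.block_size):
                block_id = (path, offset // self.block_size)
                self.blocks[block_id] = data[offset:offset + self.block_size]
                ids.append(block_id)
            # Assume 16 bytes of index space per block pointer.
            self._stage(path, ("blocks", ids), 16 * len(ids))

    def _stage(self, path: str, entry: tuple, cost: int) -> None:
        # Collect index writes and apply them in one batch.
        self.buffer.append((path, entry))
        self.buffered_bytes += cost
        if self.buffered_bytes >= self.flush_threshold:
            self.flush()

    def flush(self) -> None:
        if not self.buffer:
            return
        for path, entry in self.buffer:
            self.index[path] = entry
        self.buffer.clear()
        self.buffered_bytes = 0
        self.flushes += 1

    def read(self, path: str) -> bytes:
        # The newest buffered write wins over the flushed index.
        entry = next((e for p, e in reversed(self.buffer) if p == path), None)
        if entry is None:
            entry = self.index[path]
        kind, payload = entry
        if kind == "inline":
            return payload
        return b"".join(self.blocks[block_id] for block_id in payload)
```

In this model a small write touches only the buffer until enough bytes accumulate, while a large write bypasses the index for its data and contributes only a short list of block pointers.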
Pages: 2147 - 2152
Number of pages: 6