Improving Small File I/O Performance for Massive Digital Archives

Cited by: 4
Authors
Kim, Hwajung [1, 2]
Yeom, Heonyoung [1]
Affiliations
[1] Seoul Natl Univ, Dept Comp Sci & Engn, Seoul, South Korea
[2] Samsung Elect Co, Software R&D Ctr, Suwon, South Korea
Source
2017 IEEE 13TH INTERNATIONAL CONFERENCE ON E-SCIENCE (E-SCIENCE) | 2017
Funding
National Research Foundation, Singapore;
DOI
10.1109/eScience.2017.39
Chinese Library Classification
TP39 [Computer applications];
Discipline classification codes
081203; 0835;
Abstract
With the growth of online services, a large number of files are generated by users or by the services themselves. To serve users with different network environments and devices, online services usually keep multiple versions of the same file in various sizes. Users with high-speed networks and top-of-the-line displays can be supplied with a large, high-precision file, while users with mobile devices typically receive a smaller file with less precision. In some cases, a large file is divided into small files to make it easier to transmit over wide area networks. As a result, the underlying filesystem must efficiently maintain a large number of small files. Providing such a huge number of files to applications is one of the new challenges for existing filesystems. In this paper, we propose techniques to efficiently manage a large number of files in digital archives using the data characteristics and access patterns of the application. Based on our knowledge of the upper-layer applications, we modified both the in-memory and on-disk inode structures of the existing filesystem and were able to dramatically reduce the number of storage I/O operations needed to service the same files. Our experimental results show that the proposed methods significantly reduce the number of storage I/O operations for both reading and writing files, especially small ones. Moreover, using several synthetic and micro-benchmarks, we demonstrate that the proposed techniques reduce application-level latency and improve file operation throughput.
Pages: 256-265
Page count: 10