iTransformer: Using SSD to Improve Disk Scheduling for High-performance I/O

Cited by: 32
Authors
Zhang, Xuechen [1]
Davis, Kei [2]
Jiang, Song [1]
Affiliations
[1] Wayne State Univ, ECE Dept, Detroit, MI 48202 USA
[2] Los Alamos Natl Lab, CCS Div, Los Alamos, NM 87545 USA
Source
2012 IEEE 26TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS) | 2012
Funding
US National Science Foundation
Keywords
Disk Scheduler; Solid State Drive; Shared Storage Systems
DOI
10.1109/IPDPS.2012.70
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
The parallel data accesses inherent to large-scale data-intensive scientific computing require that data servers handle very high I/O concurrency. Concurrent requests from different processes or programs to a hard disk can cause the disk head to thrash between different disk regions, resulting in unacceptably low I/O performance. Current storage systems either rely on the disk scheduler at each data server, or use SSD as storage, to minimize this negative performance effect. However, the ability of the scheduler to alleviate this problem by scheduling requests in memory is limited by concerns such as long disk access times and the potential loss of dirty data upon system failure. Meanwhile, SSD is too expensive to be widely used as the major storage device in the HPC environment. We propose iTransformer, a scheme that employs a small SSD to schedule requests for the data on disk. Being less space-constrained than it would be with more expensive DRAM, iTransformer can buffer larger amounts of dirty data before writing it back to the disk, or prefetch a larger volume of data in a batch into the SSD. In both cases high disk efficiency can be maintained even for concurrent requests. Furthermore, the scheme allows the scheduling of requests in the background to hide the cost of random disk access behind serving process requests. Finally, because the SSD is non-volatile, concerns about the quantity of dirty data are obviated. We have implemented iTransformer in the Linux kernel and tested it on a large cluster running PVFS2. Our experiments show that iTransformer can improve the I/O throughput of the cluster by 35% on average for MPI/IO benchmarks of various data access patterns.
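The buffering idea in the abstract can be illustrated with a toy model. The sketch below is not the authors' kernel implementation. It assumes a hypothetical `SsdWriteBuffer` class standing in for the SSD staging area, a hypothetical `seek_distance` helper that measures head travel in block units, and an invented interleaved workload from two processes. It only shows why collecting dirty blocks and writing them back in address order reduces head movement compared with serving requests in arrival order.

```python
from dataclasses import dataclass, field


def seek_distance(blocks, start=0):
    """Total head travel (in blocks) when visiting `blocks` in the given order."""
    total, pos = 0, start
    for b in blocks:
        total += abs(b - pos)
        pos = b
    return total


@dataclass
class SsdWriteBuffer:
    """Hypothetical stand-in for the SSD staging area (illustration only)."""
    capacity: int                                 # dirty blocks held before a flush
    dirty: dict = field(default_factory=dict)     # block address -> data
    flushed: list = field(default_factory=list)   # order in which blocks reach the disk

    def write(self, block, data):
        self.dirty[block] = data                  # a repeated write overwrites the old copy
        if len(self.dirty) >= self.capacity:
            self.flush()

    def flush(self):
        # Write back in ascending block order so the disk head sweeps once.
        for block in sorted(self.dirty):
            self.flushed.append(block)
        self.dirty.clear()


# Assumed workload: two processes write sequentially to distant disk regions,
# and their requests arrive interleaved at the data server.
arrivals = [0, 1000, 1, 1001, 2, 1002, 3, 1003]

buf = SsdWriteBuffer(capacity=8)
for blk in arrivals:
    buf.write(blk, b"x")

print(seek_distance(arrivals))     # head travel if served in arrival order: 6997
print(seek_distance(buf.flushed))  # head travel after sorted, batched write-back: 1003
```

In this toy model a larger `capacity` lets more requests be reordered per flush, which mirrors the abstract's point that an SSD can hold more dirty data than DRAM before write-back.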
Pages: 715-726
Page count: 12
Related papers
50 in total
  • [1] Hierarchical Collective I/O Scheduling for High-Performance Computing
    Liu, Jialin
    Zhuang, Yu
    Chen, Yong
    BIG DATA RESEARCH, 2015, 2 (03) : 117 - 126
  • [2] Efficient I/O Performance-Focused Scheduling in High-Performance Computing
    Kim, Soeun
    Kim, Sunggon
    Kim, Hwajung
    APPLIED SCIENCES-BASEL, 2024, 14 (21)
  • [3] Disk-Cache and Parallelism Aware I/O Scheduling to Improve Storage System Performance
    Prabhakar, Ramya
    Kandemir, Mahmut
    Jung, Myoungsoo
    IEEE 27TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS 2013), 2013, : 357 - 368
  • [4] TECHNIQUES FOR SCHEDULING I/O IN A HIGH-PERFORMANCE MULTIMEDIA-ON-DEMAND SERVER
    JADAV, D
    SRINILTA, C
    CHOUDHARY, A
    BERRA, PB
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 1995, 30 (02) : 190 - 203
  • [5] High performance disk I/O system
    Jin, Chao
    Zhou, Feng
    Zheng, Wei-Min
    Ruan Jian Xue Bao/Journal of Software, 2002, 13 (SUPPL.) : 93 - 99
  • [6] Using Transparent Compression to Improve SSD-based I/O Caches
    Makatos, Thanos
    Klonatos, Yannis
    Marazakis, Manolis
    Flouris, Michail D.
    Bilas, Angelos
    EUROSYS'10: PROCEEDINGS OF THE EUROSYS 2010 CONFERENCE, 2010, : 1 - 14
  • [7] High-performance data mining with intelligent SSD
    Jo, Yong-Yeon
    Kim, Sang-Wook
    Cho, Sung-Woo
    Bae, Duck-Ho
    Oh, Hyunok
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2017, 20 (02) : 1155 - 1166
  • [8] High-performance data mining with intelligent SSD
    Yong-Yeon Jo
    Sang-Wook Kim
    Sung-Woo Cho
    Duck-Ho Bae
    Hyunok Oh
    Cluster Computing, 2017, 20 : 1155 - 1166
  • [9] Prototyping on using a DIMM slot as a high-performance I/O interface
    Tanabe, N
    Hamada, Y
    Mitsuhashi, A
    Nakajo, H
    Yamamoto, J
    Imashiro, H
    Kudoh, T
    Amano, H
    INNOVATIVE ARCHITECTURE FOR FUTURE GENERATION HIGH-PERFORMANCE PROCESSORS AND SYSTEMS, 2003, : 108 - 116
  • [10] Using Centralized I/O Scheduling Service(CISS) to Improve Cloud Object Storage Performance
    Shi, Xiao
    Hu, Detian
    Tang, Hongwei
    Zheng, Xiaohui
    Zhao, Xiaofang
    2018 IEEE INT CONF ON PARALLEL & DISTRIBUTED PROCESSING WITH APPLICATIONS, UBIQUITOUS COMPUTING & COMMUNICATIONS, BIG DATA & CLOUD COMPUTING, SOCIAL COMPUTING & NETWORKING, SUSTAINABLE COMPUTING & COMMUNICATIONS, 2018, : 361 - 368