HFetch: Hierarchical Data Prefetching for Scientific Workflows in Multi-Tiered Storage Environments

Cited by: 17
Authors
Devarajan, Hariharan [1]
Kougkas, Anthony [1]
Sun, Xian-He [1]
Affiliations
[1] IIT, Dept Comp Sci, Chicago, IL 60616 USA
Source
2020 IEEE 34TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM IPDPS 2020 | 2020
Funding
National Science Foundation (USA);
Keywords
hierarchical; multi-tiered; prefetching; middle-ware; server-push; data-centric;
DOI
10.1109/IPDPS47924.2020.00017
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
In the era of data-intensive computing, accessing data with high throughput and low latency is more imperative than ever. Data prefetching is a well-known technique for hiding read latency. However, existing solutions do not consider the new deep memory and storage hierarchy, and they suffer from under-utilization of prefetching resources and unnecessary evictions. Additionally, existing approaches implement a client-pull model, in which an understanding of the application's I/O behavior drives prefetching decisions. Moving towards exascale, where machines run multiple applications concurrently by accessing files in a workflow, a more data-centric approach can resolve challenges such as cache pollution and redundancy. In this study, we present HFetch, a truly hierarchical data prefetcher that adopts a server-push approach to data prefetching. We demonstrate the benefits of such an approach. Results show 10-35% performance gains over existing prefetchers and over 50% when compared to systems with no prefetching.
Pages: 62-72
Page count: 11