HFetch: Hierarchical Data Prefetching for Scientific Workflows in Multi-Tiered Storage Environments

Cited by: 17
Authors
Devarajan, Hariharan [1]
Kougkas, Anthony [1]
Sun, Xian-He [1]
Affiliation
[1] IIT, Dept Comp Sci, Chicago, IL 60616 USA
Source
2020 IEEE 34th International Parallel and Distributed Processing Symposium (IPDPS 2020) | 2020
Funding
National Science Foundation (USA);
Keywords
hierarchical; multi-tiered; prefetching; middle-ware; server-push; data-centric;
DOI
10.1109/IPDPS47924.2020.00017
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract
In the era of data-intensive computing, accessing data with high throughput and low latency is more imperative than ever. Data prefetching is a well-known technique for hiding read latency. However, existing solutions do not consider the new deep memory and storage hierarchy, and they suffer from under-utilization of prefetching resources and unnecessary evictions. Additionally, existing approaches implement a client-pull model, where understanding the application's I/O behavior drives prefetching decisions. Moving towards exascale, where machines run multiple applications that concurrently access files in a workflow, a more data-centric approach can resolve challenges such as cache pollution and redundancy. In this study, we present HFetch, a truly hierarchical data prefetcher that adopts a server-push approach to data prefetching. We demonstrate the benefits of such an approach. Results show 10-35% performance gains over existing prefetchers and over 50% when compared to systems with no prefetching.
Pages: 62-72
Page count: 11