Cross-Geography Scientific Data Transferring Trends and Behavior

Cited by: 23
Authors
Liu, Zhengchun [1]
Kettimuthu, Rajkumar [1]
Foster, Ian [1, 2]
Rao, Nageswara S. V. [3 ]
Affiliations
[1] Argonne Natl Lab, Lemont, IL 60439 USA
[2] Univ Chicago, Lemont, IL USA
[3] Oak Ridge Natl Lab, Oak Ridge, TN USA
Source
HPDC '18: PROCEEDINGS OF THE 27TH INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE PARALLEL AND DISTRIBUTED COMPUTING | 2018
Keywords
GridFTP; Wide area network; File transfer; Usage management
DOI
10.1145/3208040.3208053
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Wide area data transfers play an important role in many science applications but rely on expensive infrastructure that often delivers disappointing performance in practice. In response, we present a systematic examination of a large set of data transfer logs to characterize the transfers, including the nature of the datasets transferred, achieved throughput, user behavior, and resource usage. This analysis yields new insights that can help design better data transfer tools, optimize the networking and edge resources used for transfers, and improve performance and experience for end users. Our analysis shows that (i) most of the datasets, as well as the individual files transferred, are very small; (ii) data corruption is not negligible for large data transfers; and (iii) the utilization of data transfer nodes is low. Insights gained from our analysis suggest directions for further study.
Pages: 267-278
Page count: 12