Improving sparse data movement performance using multiple paths on the Blue Gene/Q supercomputer

Cited by: 4
Authors
Bui, Huy [1]
Jung, Eun-Sung [2]
Vishwanath, Venkatram [2]
Johnson, Andrew [1]
Leigh, Jason [4]
Papka, Michael E. [3, 5]
Affiliations
[1] Univ Illinois, Elect Visualizat Lab, 842 Taylor St, Chicago, IL 60607 USA
[2] Argonne Natl Lab, Math & Comp Sci, 9700 S Cass Ave, Argonne, IL 60439 USA
[3] Argonne Natl Lab, Argonne Leadership Comp Facil, 9700 S Cass Ave, Argonne, IL 60439 USA
[4] Univ Hawaii, LAVA, 1680 East West Rd, Honolulu, HI 96822 USA
[5] No Illinois Univ, 300 Normal Rd, De Kalb, IL 60115 USA
Funding
U.S. National Science Foundation;
Keywords
Multiple paths; Sparse data movement; Topology-aware aggregation; Data-intensive; Blue Gene/Q;
DOI
10.1016/j.parco.2015.09.002
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
In situ analysis has been proposed as a promising solution to glean faster insights and reduce the amount of data written to storage. A critical challenge here is that the reduced dataset is typically located on a subset of the nodes and needs to be written out to storage. Data coupling in multiphysics codes also exhibits a sparse data movement pattern, wherein data movement occurs among a subset of nodes. We evaluate the performance of data movement for sparse data patterns on the IBM Blue Gene/Q supercomputing system "Mira" and identify performance bottlenecks. We propose a multipath data movement algorithm for sparse data patterns, based on an adaptation of a maximum flow algorithm together with breadth-first search, that fully exploits all the underlying data paths and I/O nodes to improve data movement. We demonstrate the efficacy of our solutions through a set of microbenchmarks and application benchmarks on Mira, scaling up to 131,072 compute cores. The results show that our approach achieves up to 5x improvement in achievable throughput compared with the default mechanisms. (C) 2015 Elsevier B.V. All rights reserved.
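The abstract names maximum flow combined with breadth-first search as the basis of the multipath algorithm but does not give its details. As a rough illustration of that general idea only, the sketch below is the textbook Edmonds-Karp method (BFS-based augmenting paths), not the authors' adaptation. The function name `max_flow_paths`, the node labels, and the link capacities are all hypothetical.

```python
from collections import deque


def max_flow_paths(capacity, source, sink):
    """Textbook Edmonds-Karp: augment along BFS-shortest paths until none remain.

    capacity: dict mapping (u, v) -> link capacity (hypothetical units).
    Returns (total_flow, [(path, flow_on_path), ...]).
    """
    # Residual capacities, with a zero-capacity reverse edge for every link.
    residual = {}
    neighbors = {}
    for (u, v), c in capacity.items():
        residual[(u, v)] = residual.get((u, v), 0) + c
        residual.setdefault((v, u), 0)
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    total = 0
    paths = []
    while True:
        # BFS from the source over edges that still have spare capacity.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in sorted(neighbors.get(u, ())):
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            break  # no augmenting path left: the flow is maximal

        # Walk back from the sink to recover the path and its bottleneck.
        path = [sink]
        while path[-1] != source:
            path.append(parent[path[-1]])
        path.reverse()
        hops = list(zip(path, path[1:]))
        bottleneck = min(residual[hop] for hop in hops)
        for a, b in hops:
            residual[(a, b)] -= bottleneck
            residual[(b, a)] += bottleneck
        total += bottleneck
        paths.append((path, bottleneck))
    return total, paths


# Hypothetical toy network: one data source, two relay nodes, one I/O node.
links = {
    ("src", "a"): 2, ("src", "b"): 2,
    ("a", "io"): 2, ("b", "io"): 2,
    ("a", "b"): 1,
}
total, paths = max_flow_paths(links, "src", "io")
print(total, paths)
```

In this toy network the data is split across both relay nodes instead of using a single route, which is the intuition behind using several paths at once. How the paper maps this onto the Blue Gene/Q torus and its I/O nodes is not stated in the abstract.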
Pages: 3-16
Number of pages: 14