Efficient Intranode Communication in GPU-Accelerated Systems

Cited by: 1
Authors
Ji, Feng [1]
Aji, Ashwin M. [2]
Dinan, James [3]
Buntinas, Darius [3]
Balaji, Pavan [3]
Feng, Wu-chun [2]
Ma, Xiaosong [1, 4]
Affiliations
[1] North Carolina State Univ, Dept Comp Sci, Raleigh, NC 27695 USA
[2] Virginia Tech, Dept Comp Sci, Blacksburg, VA USA
[3] Argonne Natl Lab, Math & Comp Sci Div, Argonne, IL 60439 USA
[4] Oak Ridge Natl Lab, Div Math & Comp Sci, Oak Ridge, TN USA
Source
2012 IEEE 26TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS & PHD FORUM (IPDPSW) | 2012
Funding
National Science Foundation (USA);
Keywords
IMPLEMENTATION;
DOI
10.1109/IPDPSW.2012.227
Chinese Library Classification
TP301 [Theory, Methods];
Subject classification code
081202;
Abstract
Current MPI implementations are unaware of accelerator memory (i.e., GPU device memory) and require programmers to explicitly move data between memory spaces. This approach is inefficient, especially for intranode communication, where it can result in several extra copy operations. In this work, we integrate GPU awareness into a popular MPI runtime system and develop techniques that significantly reduce the cost of intranode communication involving one or more GPUs. Experimental results show up to a 2x increase in bandwidth, yielding an average 4.3% improvement in the total execution time of a halo exchange benchmark.
Pages: 1838-1847
Number of pages: 10