High performance RDMA protocols in HPC

Cited by: 0
Authors
Woodall, Tim S. [1]
Shipman, Galen M.
Bosilca, George
Graham, Richard L.
Maccabe, Arthur B.
Affiliations
[1] Los Alamos Natl Lab, Adv Comp Lab, Los Alamos, NM 87545 USA
[2] Univ Tennessee, Dept Comp Sci, Knoxville, TN 37996 USA
[3] Univ New Mexico, Dept Comp Sci, Albuquerque, NM 87131 USA
Source
RECENT ADVANCES IN PARALLEL VIRTUAL MACHINE AND MESSAGE PASSING INTERFACE | 2006 / Vol. 4192
Keywords
DOI
Not available
Chinese Library Classification number
TP301 [Theory, methods];
Discipline classification code
081202;
Abstract
Modern network communication libraries that leverage Remote Direct Memory Access (RDMA) and OS bypass protocols, such as InfiniBand [2] and Myrinet [10], can offer significant performance advantages over conventional send/receive protocols. However, this performance often comes with hidden per-buffer setup costs [4]. This paper describes a unique long-message MPI [9] library 'pipeline' protocol that addresses these constraints while avoiding some of the pitfalls of existing techniques. By using portable send/receive semantics to hide the cost of initializing the pipeline algorithm, and then effectively overlapping the cost of memory registration with RDMA operations, this protocol provides very good performance for any large-memory usage pattern. This approach avoids the use of non-portable memory hooks and does not keep registered memory from being returned to the OS. Through this approach, bandwidth may be increased by up to 67% when memory buffers are not effectively reused, while providing superior performance in the effective bandwidth benchmark. Several user-level protocols are explored using Open MPI's PML (Point-to-point Messaging Layer) and compared and contrasted with this 'pipeline' protocol.
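The benefit of overlapping memory registration with RDMA transfers can be illustrated with a toy timing model. The sketch below is not the paper's implementation: the function names, the fixed per-fragment costs `t_reg` and `t_rdma`, and the two-stage pipeline structure are all assumptions made for illustration, and the model ignores the initial send/receive phase the abstract mentions.

```python
def serial_time(n_frags, t_reg, t_rdma):
    """Hypothetical baseline: register every fragment, then transfer them all."""
    return n_frags * t_reg + n_frags * t_rdma


def pipelined_time(n_frags, t_reg, t_rdma):
    """Hypothetical two-stage pipeline: register fragment i+1 while
    fragment i is being transferred."""
    reg_done = 0.0   # time at which the latest registration finished
    xfer_done = 0.0  # time at which the latest transfer finished
    for _ in range(n_frags):
        # Registrations run back to back on the host.
        reg_done += t_reg
        # A transfer starts once its fragment is registered and the
        # previous transfer has left the link.
        xfer_done = max(xfer_done, reg_done) + t_rdma
    return xfer_done


if __name__ == "__main__":
    # Assumed costs in arbitrary time units, chosen only for illustration.
    n, t_reg, t_rdma = 4, 1.0, 2.0
    print("serial:   ", serial_time(n, t_reg, t_rdma))
    print("pipelined:", pipelined_time(n, t_reg, t_rdma))
```

In this model the pipelined total approaches `n_frags * max(t_reg, t_rdma)` plus one instance of the smaller cost, so most of the registration cost is hidden whenever the transfer time per fragment is at least as large as the registration time.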
Pages: 76-85
Page count: 10