FINE-GRAINED MULTITHREADING SUPPORT FOR HYBRID THREADED MPI PROGRAMMING

Cited by: 34

Authors:

Balaji, Pavan [1]

Buntinas, Darius [1]

Goodell, David [1]

Gropp, William [2]

Thakur, Rajeev [1]

Affiliations:

[1] Argonne Natl Lab, Math & Comp Sci Div, Argonne, IL 60439 USA

[2] Univ Illinois, Dept Comp Sci, Urbana, IL 61801 USA

Source:

INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS | 2010 / Vol. 24 / No. 1

Keywords:

MPI; threads; hybrid programming; fine-grained locks

DOI:

10.1177/1094342009360206

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

As high-end computing systems continue to grow in scale, recent advances in multi- and many-core architectures have pushed such growth toward more dense architectures, that is, more processing elements per physical node, rather than more physical nodes themselves. Although a large number of scientific applications have relied so far on an MPI-everywhere model for programming high-end parallel systems, this model may not be sufficient for future machines, given their physical constraints such as decreasing amounts of memory per processing element and shared caches. As a result, application and computer scientists are exploring alternative programming models that involve using MPI between address spaces and some other threaded model, such as OpenMP, Pthreads, or Intel TBB, within an address space. Such hybrid models require efficient support from an MPI implementation for MPI messages sent from multiple threads simultaneously. In this paper, we explore the issues involved in designing such an implementation. We present four approaches to building a fully thread-safe MPI implementation, with decreasing levels of critical-section granularity (from coarse-grain locks to fine-grain locks to lock-free operations) and correspondingly increasing levels of complexity. We present performance results that demonstrate the performance implications of the different approaches.


Pages: 49-57

Page count: 9
