Advanced Thread Synchronization for Multithreaded MPI Implementations

Cited by: 10

Authors:

Hoang-Vu Dang [1]

Seo, Sangmin [2]

Amer, Abdelhalim [2]

Balaji, Pavan [2]

Affiliations:

[1] Univ Illinois, Dept Comp Sci, Urbana, IL 61801 USA

[2] Argonne Natl Lab, Math & Comp Sci Div, 9700 S Cass Ave, Argonne, IL 60439 USA

Source:

2017 17TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND GRID COMPUTING (CCGRID) | 2017

Funding:

U.S. National Science Foundation (NSF)

Keywords:

MPI; threads; OpenMP; thread safety; lock; mutex; synchronization;

DOI:

10.1109/CCGRID.2017.65

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

Concurrent multithreaded access to the Message Passing Interface (MPI) is gaining importance to support emerging hybrid MPI applications. The interoperability between threads and MPI, however, is complex and renders efficient implementations nontrivial. Prior studies showed that threads waiting for communication progress (waiting threads) often interfere with others (active threads) and degrade their progress. This situation occurs when both classes of threads compete for the same MPI resource and passing ownership to waiting threads does not guarantee that communication advances. The best-known practical solution prioritizes active threads and adapts first-in-first-out arbitration within each class. This approach, however, suffers from residual wasted resource acquisitions (waste) and ignores data locality, thus resulting in poor scalability. In this work, we propose thread synchronization improvements to eliminate waste while preserving data locality in a production MPI implementation. First, we leverage MPI knowledge and a fast synchronization method to eliminate waste and accelerate progress. Second, we rely on a cooperative progress model that dynamically elects and restricts a single waiting thread to drive a communication context for improved data locality. Third, we prioritize active threads and synchronize them with a locality-preserving lock that is hierarchical and exploits unbounded bias for high throughput. Results show significant improvement in synthetic microbenchmarks and two MPI+OpenMP applications.
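The prior scheme the abstract refers to (active threads served before waiting threads, with first-in-first-out order inside each class) can be illustrated with a small sketch. This is not the paper's implementation, which lives inside an MPI library; it is a minimal Python model written for illustration, and the names `TwoClassFifoLock`, `queued`, and `demo` are hypothetical.

```python
import threading
import time
from collections import deque


class TwoClassFifoLock:
    """Illustrative mutex: 'active' callers go before 'waiting' callers,
    and callers of the same class are served in arrival (FIFO) order."""

    def __init__(self):
        self._cond = threading.Condition()
        self._held = False
        self._queues = {"active": deque(), "waiting": deque()}

    def _next_in_line(self):
        # The active queue is always inspected first (higher priority).
        for cls in ("active", "waiting"):
            if self._queues[cls]:
                return self._queues[cls][0]
        return None

    def acquire(self, cls):
        ticket = object()  # unique marker for this request
        with self._cond:
            self._queues[cls].append(ticket)
            # Block until the lock is free and this ticket is at the front.
            while self._held or self._next_in_line() is not ticket:
                self._cond.wait()
            self._queues[cls].popleft()
            self._held = True

    def release(self):
        with self._cond:
            self._held = False
            self._cond.notify_all()

    def queued(self):
        # Number of requests currently blocked in acquire().
        with self._cond:
            return sum(len(q) for q in self._queues.values())


def demo():
    """Queue two waiting and two active threads behind a held lock,
    then release it and return the order in which they got the lock."""
    lock = TwoClassFifoLock()
    order = []
    lock.acquire("active")  # hold the lock so every worker must queue

    def worker(name, cls):
        lock.acquire(cls)
        order.append(name)  # safe: executed while holding the lock
        lock.release()

    requests = [("w1", "waiting"), ("w2", "waiting"),
                ("a1", "active"), ("a2", "active")]
    threads = []
    for count, (name, cls) in enumerate(requests, start=1):
        t = threading.Thread(target=worker, args=(name, cls))
        t.start()
        threads.append(t)
        while lock.queued() < count:  # ensure a fixed arrival order
            time.sleep(0.001)

    lock.release()
    for t in threads:
        t.join()
    return order
```

In this model, `demo()` returns the active requests first even though they arrived last. Every release wakes all blocked threads so that each can re-check its position, which loosely mirrors the kind of overhead the abstract attributes to this approach; the paper's own techniques (fast synchronization using MPI knowledge, a single elected progress thread, and a hierarchical biased lock) are not modeled here.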

Citation

Pages: 314-324

Page count: 11
