Advanced Thread Synchronization for Multithreaded MPI Implementations

Cited by: 10
Authors
Dang, Hoang-Vu [1]
Seo, Sangmin [2]
Amer, Abdelhalim [2]
Balaji, Pavan [2]
Affiliations
[1] Univ Illinois, Dept Comp Sci, Urbana, IL 61801 USA
[2] Argonne Natl Lab, Math & Comp Sci Div, 9700 S Cass Ave, Argonne, IL 60439 USA
Source
2017 17TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND GRID COMPUTING (CCGRID) | 2017
Funding
U.S. National Science Foundation;
Keywords
MPI; threads; OpenMP; thread safety; lock; mutex; synchronization;
DOI
10.1109/CCGRID.2017.65
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Concurrent multithreaded access to the Message Passing Interface (MPI) is gaining importance to support emerging hybrid MPI applications. The interoperability between threads and MPI, however, is complex and renders efficient implementations nontrivial. Prior studies showed that threads waiting for communication progress (waiting threads) often interfere with others (active threads) and degrade their progress. This situation occurs when both classes of threads compete for the same MPI resource and ownership passing to waiting threads does not guarantee communication to advance. The best-known practical solution prioritizes active threads and adapts first-in-first-out arbitration within each class. This approach, however, suffers from residual wasted resource acquisitions (waste) and ignores data locality, thus resulting in poor scalability. In this work, we propose thread synchronization improvements to eliminate waste while preserving data locality in a production MPI implementation. First, we leverage MPI knowledge and a fast synchronization method to eliminate waste and accelerate progress. Second, we rely on a cooperative progress model that dynamically elects and restricts a single waiting thread to drive a communication context for improved data locality. Third, we prioritize active threads and synchronize them with a locality-preserving lock that is hierarchical and exploits unbounded bias for high throughput. Results show significant improvement in synthetic microbenchmarks and two MPI+OpenMP applications.
Pages: 314-324
Page count: 11
Related papers
50 in total
  • [1] Test suite for evaluating performance of MPI implementations that support MPI_THREAD_MULTIPLE
    Thakur, Rajeev
    Gropp, William
    RECENT ADVANCES IN PARALLEL VIRTUAL MACHINE AND MESSAGE PASSING INTERFACE, 2007, 4757 : 46 - 55
  • [2] Survey of MPI Implementations
    Hafeez, Mehnaz
    Asghar, Sajjad
    Malik, Usman Ahmad
    Rehman, Adeel Ur
    Riaz, Naveed
    DIGITAL INFORMATION AND COMMUNICATION TECHNOLOGY AND ITS APPLICATIONS, PT II, 2011, 167 (02) : 206 - +
  • [3] Thread Assignment of Multithreaded Network Applications in Multicore/Multithreaded Processors
    Radojkovic, Petar
    Cakarevic, Vladimir
    Verdu, Javier
    Pajuelo, Alex
    Cazorla, Francisco J.
    Nemirovsky, Mario
    Valero, Mateo
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2013, 24 (12) : 2513 - 2525
  • [4] Eliminating contention bottlenecks in multithreaded MPI
    Dang, Hoang-Vu
    Snir, Marc
    Gropp, William
    PARALLEL COMPUTING, 2017, 69 : 1 - 23
  • [5] Lock Contention Management in Multithreaded MPI
    Amer, Abdelhalim
    Lu, Huiwei
    Balaji, Pavan
    Chabbi, Milind
    Wei, Yanjie
    Hammond, Jeff
    Matsuoka, Satoshi
    ACM TRANSACTIONS ON PARALLEL COMPUTING, 2018, 5 (03)
  • [6] Finepoints: Partitioned Multithreaded MPI Communication
    Grant, Ryan E.
    Dosanjh, Matthew G. F.
    Levenhagen, Michael J.
    Brightwell, Ron
    Skjellum, Anthony
    HIGH PERFORMANCE COMPUTING, ISC HIGH PERFORMANCE 2019, 2019, 11501 : 330 - 350
  • [7] Testing the correctness of MPI implementations
    Keller, Rainer
    Resch, Michael
    ISPDC 2006: FIFTH INTERNATIONAL SYMPOSIUM ON PARALLEL AND DISTRIBUTED COMPUTING, PROCEEDINGS, 2006, : 291 - +
  • [8] A comparison of three MPI implementations
    Vinter, B
    Bjorndalen, JM
    Anshus, OJ
    Larsen, T
    COMMUNICATING PROCESS ARCHITECTURES 2004, 2004, 62 : 127 - 136
  • [9] Give MPI Threading a Fair Chance: A Study of Multithreaded MPI Designs
    Patinyasakdikul, Thananon
    Eberius, David
    Bosilca, George
    Hjelm, Nathan
    2019 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING (CLUSTER), 2019, : 246 - 256
  • [10] Thread communication over MPI
    Nitsche, T
    RECENT ADVANCES IN PARALLEL VIRTUAL MACHINE AND MESSAGE PASSING INTERFACE, PROCEEDINGS, 2000, 1908 : 145 - 151