Hardware topology management in MPI applications through hierarchical communicators

Cited by: 4
Authors
Goglin, Brice [1]
Jeannot, Emmanuel [1]
Mansouri, Farouk [2]
Mercier, Guillaume [1]
Affiliations
[1] Univ Bordeaux, INRIA, LaBRI, CNRS, Bordeaux INP, 200 Ave Vieille Tour, F-33405 Talence, France
[2] DDN Storage, 10 Rue Andras Beck, F-92360 Meudon, France
Keywords
Hierarchy; Hardware topology; Message passing; Process placement
DOI
10.1016/j.parco.2018.05.006
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
The MPI standard is a major contribution to the landscape of parallel programming. Since its inception in the mid-1990s, it has ensured portability and performance for parallel applications on a wide spectrum of machines and architectures. With the advent of multicore machines, understanding and taking into account the underlying physical topology and memory hierarchy have become of paramount importance. At the same time, providing abstract mechanisms to manipulate the hardware topology is also fundamental. However, despite recent evolutions, the MPI standard in its current state still offers no mechanisms to achieve this. In this paper, we detail several additions to the standard for building new MPI communicators corresponding to hardware hierarchy levels. These additions provide the user with tools to address hardware topology and locality issues while improving application performance. (C) 2018 Elsevier B.V. All rights reserved.
Pages: 70-90
Page count: 21