τ-Lop: Modeling performance of shared memory MPI

Cited by: 17

Authors:

Rico-Gallego, Juan-Antonio [1]

Diaz-Martin, Juan-Carlos [2]

Affiliations:

[1] Univ Extremadura, Dept Comp Syst Engn & Telemat, Caceres 10003, Spain

[2] Univ Extremadura, Dept Comp & Commun Technol, Caceres 10003, Spain

Source:

PARALLEL COMPUTING | 2015, Vol. 46

Keywords:

Formal models; Performance analysis; Parallel algorithms; MPI collectives; COMMUNICATION; LOGP;

DOI:

10.1016/j.parco.2015.02.006

Chinese Library Classification:

TP301 [Theory, Methods];

Subject classification code:

081202;

Abstract:

Formal modeling of the cost of MPI primitives allows a machine-independent representation, comparison, and performance analysis of their underlying algorithms. Currently accepted methods are all offspring of LogP, which was conceived to model the cost of inter-node point-to-point messages in networks of single-processor machines. As new supercomputers are built upon cheap commodity boards with a growing number of cores accessing hierarchical memories, intra-node communication becomes progressively more relevant. Techniques for shared memory communication, such as message segmentation and collectives not based on point-to-point operations, are substantively different from their inter-node counterparts. This paper unveils the reasons for the poor fit of LogGP and the most recent models in this domain, log(n)P and mlog(n)P, and proposes a new model named τ-Lop, rooted in them but addressing the challenge of accurately modeling shared memory MPI communications. Broadcast algorithms of mainstream MPI implementations, MPICH and Open MPI, are modeled and analyzed. © 2015 Elsevier B.V. All rights reserved.

Pages: 14-31

Page count: 18

25 references in total

[1]

Alexandrov A., 1995, LOGGP INCORPORATING

[2]

[Anonymous], P 11 EUR PVM MPI US

[3]

Benson GD, 2003, LECT NOTES COMPUT SC, V2840, P335

[4]

Buntinas D, 2006, SIXTH IEEE INTERNATIONAL SYMPOSIUM ON CLUSTER COMPUTING AND THE GRID, P521

[5]

Cameron K. W., 2003, P 17 INT S PAR DISTR, P482

[6]

Cameron, Kirk W.; Ge, Rong; Sun, Xian-He. lognP and log3P: Accurate analytical models of point-to-point communication in distributed systems [J]. IEEE TRANSACTIONS ON COMPUTERS, 2007, 56 (03): 314-327

[7]

CULLER D, 1993, SIGPLAN NOTICES, V28, P1, DOI 10.1145/173284.155333

[8]

Culler, DE; Karp, RM; Patterson, D; Sahay, A; Santos, EE; Schauser, KE; Subramonian, R; vonEicken, T. LogP: A practical model of parallel computation [J]. COMMUNICATIONS OF THE ACM, 1996, 39 (11): 78-85

[9]

Goglin, Brice; Moreaud, Stephanie. KNEM: A generic and scalable kernel-assisted intra-node MPI communication framework [J]. JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2013, 73 (02): 176-188

[10]

HOCKNEY RW, 1994, PARALLEL COMPUT, V20, P389, DOI 10.1016/0167-8191(94)90095-7