Diagnosing the Causes and Severity of One-Sided Message Contention

Citations: 0
Authors
Tallent, Nathan R. [1]
Vishnu, Abhinav [1]
Van Dam, Hubertus [1]
Daily, Jeff [1]
Kerbyson, Darren J. [1]
Hoisie, Adolfy [1]
Affiliations
[1] Pacific Northwest National Laboratory, Richland, WA 99352 USA
Keywords
Network contention/congestion; one-sided messages; performance analysis; performance modeling; performance
DOI
10.1145/2688500.2688516
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
Two trends suggest network contention for one-sided messages is poised to become a performance problem that concerns application developers: an increased interest in one-sided programming models and a rising ratio of hardware threads to network injection bandwidth. Often it is difficult to reason about when one-sided tasks decrease or increase network contention. We present effective and portable techniques for diagnosing the causes and severity of one-sided message contention. To detect that a message is affected by contention, we maintain statistics representing instantaneous network resource demand. Using lightweight measurement and modeling, we identify the portion of a message's latency that is due to contention and whether contention occurs at the initiator or target. We attribute these metrics to program statements in their full static and dynamic context. We characterize contention for an important computational chemistry benchmark on InfiniBand, Cray Aries, and IBM Blue Gene/Q interconnects. We pinpoint the sources of contention, estimate their severity, and show that when message delivery time deviates from an ideal model, there are other messages contending for the same network links. With a small change to the benchmark, we reduce contention by 50% and improve total runtime by 20%.
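The idea of comparing a message's delivery time against an ideal model can be illustrated with a minimal sketch. The abstract does not give the authors' model or data structures, so everything below is an assumption for illustration: the ideal time uses a simple latency-plus-bandwidth model, the names `Message`, `ideal_latency`, `contention_excess`, and `likely_side` are hypothetical, and the in-flight counters stand in for the "instantaneous network resource demand" statistics mentioned above.

```python
from dataclasses import dataclass


@dataclass
class Message:
    """One observed one-sided message (all fields are illustrative assumptions)."""
    size_bytes: int
    observed_latency_s: float
    inflight_at_initiator: int  # other messages outstanding at the initiator
    inflight_at_target: int     # other messages outstanding at the target


def ideal_latency(size_bytes: int, base_latency_s: float,
                  bandwidth_bytes_per_s: float) -> float:
    """Contention-free time under an assumed latency + size/bandwidth model."""
    return base_latency_s + size_bytes / bandwidth_bytes_per_s


def contention_excess(msg: Message, base_latency_s: float,
                      bandwidth_bytes_per_s: float) -> float:
    """Portion of the observed latency that exceeds the ideal model (never negative)."""
    ideal = ideal_latency(msg.size_bytes, base_latency_s, bandwidth_bytes_per_s)
    return max(0.0, msg.observed_latency_s - ideal)


def likely_side(msg: Message) -> str:
    """Guess where contention occurred from the demand counters (a crude heuristic)."""
    if msg.inflight_at_initiator == 0 and msg.inflight_at_target == 0:
        return "none"
    if msg.inflight_at_initiator >= msg.inflight_at_target:
        return "initiator"
    return "target"


if __name__ == "__main__":
    msg = Message(size_bytes=1_000_000, observed_latency_s=0.003,
                  inflight_at_initiator=1, inflight_at_target=6)
    excess = contention_excess(msg, base_latency_s=2e-6, bandwidth_bytes_per_s=1e9)
    print(f"excess latency: {excess:.6f} s, likely side: {likely_side(msg)}")
```

In this sketch a message is flagged only when its observed latency exceeds the modeled time, which mirrors the abstract's claim that deviations from an ideal model coincide with other messages contending for the same links. The actual measurement and attribution machinery in the paper is more involved than this.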
Pages: 130-139
Page count: 10