Accelerating Messages by Avoiding Copies in an Asynchronous Task-based Programming Model

Cited by: 0
Authors
Bhat, Nitin [1]
White, Sam [2]
Ramos, Evan [1]
Kale, Laxmikant V. [1,2]
Affiliations
[1] Charmworks Inc, Urbana, IL 61801 USA
[2] Univ Illinois, Dept Comp Sci, Urbana, IL USA
Source
PROCEEDINGS OF SIXTH INTERNATIONAL IEEE WORKSHOP ON EXTREME SCALE PROGRAMMING MODELS AND MIDDLEWARE (ESPM2 2021) | 2021
Keywords
Charm++; AMPI; RDMA; Parallel Programming; Asynchronous Tasking; Communication Optimizations
DOI
10.1109/ESPM254806.2021.00007
CLC Number (Chinese Library Classification)
TP31 [Computer Software]
Discipline Codes
081202; 0835
Abstract
Task-based programming models promise improved communication performance for irregular, fine-grained, and load-imbalanced applications. They do so by relaxing some of the messaging semantics of stricter models and exploiting that freedom at lower levels of the software stack. For example, while MPI's two-sided communication model guarantees in-order delivery, requires matching sends to receives, and leaves communication scheduling to the user, task-based models generally let the runtime system schedule all execution based on dependencies and message deliveries as they occur. These messaging semantics are critical to enabling high performance. In this paper, we build on previous work that added zero-copy semantics to Converse/LRTS. We examine the messaging semantics of Charm++ as they relate to large message buffers, identify shortcomings, and define new communication APIs to address them. Our work enables in-place communication semantics for point-to-point messaging, broadcasts, transmission of read-only variables at program startup, and migration of chares. We demonstrate the performance of the new communication APIs using benchmarks for Charm++ and Adaptive MPI, showing nearly 90% lower latency and 2x lower peak memory usage.
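The in-place point-to-point semantics summarized above can be illustrated with the style of zero-copy entry-method API that Charm++ exposes. This is a hedged, non-runnable sketch (it requires the Charm++ interface translator and runtime to build): the `nocopy` attribute and `CkSendBuffer` are drawn from the public Charm++ manual, while the chare names, buffer, and callback are hypothetical and not taken from this record.

```cpp
// .ci interface file: the `nocopy` attribute marks a parameter for
// in-place (zero-copy) transfer instead of packing it into the message.
//   entry void recvData(nocopy double data[n], size_t n);

// C++ sender side (hypothetical chare `Sender` targeting `receiverProxy`):
void Sender::send() {
  // Callback fired once the runtime no longer needs `buf`, telling the
  // sender when it is safe to reuse or free the buffer.
  CkCallback done(CkIndex_Sender::bufferFree(), thisProxy);
  // CkSendBuffer wraps the user buffer so the runtime can transfer it
  // in place (e.g., via RDMA) rather than copying it into a message.
  receiverProxy.recvData(CkSendBuffer(buf, done), n);
}
```

The completion callback is what makes the relaxed semantics safe: ownership of the buffer passes to the runtime at the call and returns to the user only when the callback fires.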
Pages: 10-19
Page count: 10