Towards Performance and Scalability Analysis of Distributed Memory Programs on Large-Scale Clusters

Cited by: 1
Authors
Medya, Sourav [1, 2]
Cherkasova, Ludmila [2]
Magalhaes, Guilherme [3]
Ozonat, Kivanc [2]
Padmanabha, Chaitra [3]
Sarma, Jiban [3]
Sheikh, Imran [3]
Affiliations
[1] Univ Calif Santa Barbara, Santa Barbara, CA 93106 USA
[2] Hewlett Packard Labs, Palo Alto, CA 94304 USA
[3] Hewlett Packard Enterprise, Palo Alto, CA USA
DOI
10.1145/2851553.2858669
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Many HPC and modern Big Data processing applications belong to the class of so-called scale-out applications, in which the application dataset is partitioned and processed by a cluster of machines. Understanding and assessing the scalability of the application is one of the primary goals during its implementation. Typically, during the design and implementation phase, the programmer is bound to a limited-size cluster for debugging and profiling experiments. The challenge is to assess how the program will scale when executed on a larger cluster. In a larger cluster, each node processes a smaller fraction of the original dataset, but the communication volume and communication time may increase significantly, which can offset the gains and yield diminishing performance benefits. Distributed memory applications exhibit complex behavior: they tend to interleave computation and communication, use bursty transfers, and rely on global synchronization primitives. Therefore, one of the main challenges is analyzing the bandwidth demands caused by increased communication volume as a function of cluster size. In this paper, we introduce a novel approach to assessing the scalability and performance of a distributed memory program for execution on a large-scale cluster. Our solution involves 1) a limited set of traditional experiments performed on a medium-size cluster and 2) an additional set of similar experiments performed with an "interconnect bandwidth throttling" tool, which enables the assessment of communication demands with respect to available bandwidth. This approach enables prediction of the cluster size at which communication cost becomes the dominant component, the point beyond which increasing the cluster size yields diminishing returns. We demonstrate the proposed approach using the popular Graph500 benchmark.
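The crossover idea in the abstract can be illustrated with a toy calculation. The sketch below is not the authors' model: it assumes, purely for illustration, that compute time shrinks as 1/n with n nodes and that communication time is the per-run communication volume divided by the available interconnect bandwidth. All function names, the linear volume model, and the numeric values are hypothetical.

```python
from typing import Callable, Optional


def compute_time(total_work_s: float, nodes: int) -> float:
    """Assumed ideal scaling: total compute work split evenly across nodes."""
    return total_work_s / nodes


def comm_time(volume_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Assumed communication cost: volume divided by available bandwidth."""
    return volume_bytes / bandwidth_bytes_per_s


def crossover_cluster_size(
    total_work_s: float,
    volume_at: Callable[[int], float],
    bandwidth_bytes_per_s: float,
    max_nodes: int,
) -> Optional[int]:
    """Return the smallest cluster size at which communication time is at
    least as large as compute time, or None if that never happens up to
    max_nodes."""
    for n in range(2, max_nodes + 1):
        if comm_time(volume_at(n), bandwidth_bytes_per_s) >= compute_time(total_work_s, n):
            return n
    return None


# Hypothetical inputs: 1000 s of total compute work and a communication
# volume that grows linearly with cluster size (1 GB per node).
def linear_volume(n: int) -> float:
    return 1e9 * n


full_bw = crossover_cluster_size(1000.0, linear_volume, 1e10, 1024)
throttled = crossover_cluster_size(1000.0, linear_volume, 5e9, 1024)
print(full_bw, throttled)
```

Under these assumed inputs, halving the bandwidth moves the crossover to a smaller cluster, which is the kind of sensitivity that throttling experiments are meant to expose. In practice the volume function would have to be fitted from measurements, not assumed.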
Pages: 113-116
Page count: 4