TAUoverSupermon: Low-Overhead Online Parallel Performance Monitoring

Cited by: 0
Authors
Nataraj, Aroon [1]
Sottile, Matthew [2]
Morris, Alan [1]
Malony, Allen D. [1]
Shende, Sameer [1]
Affiliations
[1] Univ Oregon, Dept Comp & Informat Sci, Eugene, OR 97403 USA
[2] Los Alamos Natl Lab, Los Alamos, NM USA
Source
EURO-PAR 2007 PARALLEL PROCESSING, PROCEEDINGS | 2007 / Vol. 4641
Keywords
online performance measurement; cluster monitoring
DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Online application performance monitoring allows tracking performance characteristics during execution as opposed to doing so post-mortem. This opens up several possibilities otherwise unavailable such as real-time visualization and application performance steering that can be useful in the context of long-running applications. As HPC systems grow in size and complexity, the key challenge is to keep the online performance monitor scalable and low overhead while still providing a useful performance reporting capability. Two fundamental components that constitute such a performance monitor are the measurement and transport systems. We adapt and combine two existing, mature systems - TAU and Supermon - to address this problem. TAU performs the measurement while Supermon is used to collect the distributed measurement state. Our experiments show that this novel approach leads to very low-overhead application monitoring as well as other benefits unavailable from using a transport such as NFS.
Pages: 85 / +
Page count: 3
Related papers
50 in total
  • [41] A low-overhead networking mechanism for virtualized high-performance computing systems
    Jae-Wan Jang
    Euiseong Seo
    Heeseung Jo
    Jin-Soo Kim
    The Journal of Supercomputing, 2012, 59 : 443 - 468
  • [42] sRDMA: A General and Low-Overhead Scheduler for RDMA
    Wang, Xizheng
    Wang, Shuai
    Li, Dan
    PROCEEDINGS OF THE 7TH ASIA-PACIFIC WORKSHOP ON NETWORKING, APNET 2023, 2023, : 21 - 27
  • [43] Low-overhead inline deduplication for persistent memory
    Chen, Wande
    Chen, Zhenke
    Li, Dingding
    Liu, Hai
    Tang, Yong
    TRANSACTIONS ON EMERGING TELECOMMUNICATIONS TECHNOLOGIES, 2021, 32 (08)
  • [44] Evaluation of a Low-Overhead Forwarding Algorithm for Platooning
    Larsson, Marcus
    Warg, Fredrik
    Karlsson, Kristian
    Jonsson, Magnus
    2015 IEEE INTERNATIONAL CONFERENCE ON VEHICULAR ELECTRONICS AND SAFETY (ICVES), 2015, : 48 - 55
  • [45] Low-Overhead Control Channels in Wireless Networks
    Chai, Eugene
    Shin, Kang G.
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2015, 14 (11) : 2302 - 2315
  • [46] Low-overhead quantum computing with the color code
    Thomsen, Felix
    Kesselring, Markus S.
    Bartlett, Stephen D.
    Brown, Benjamin J.
    PHYSICAL REVIEW RESEARCH, 2024, 6 (04):
  • [47] Low-overhead message tracking for distributed messaging
    Jun, Seung
    Astley, Mark
    MIDDLEWARE 2006, PROCEEDINGS, 2006, 4290 : 363 - 381
  • [48] A Low-Overhead Integrated Aging and SEU Sensor
    Rohbani, Nezam
    Miremadi, Seyed-Ghassem
    IEEE TRANSACTIONS ON DEVICE AND MATERIALS RELIABILITY, 2018, 18 (02) : 205 - 213
  • [49] A Low-Overhead Dynamic Optimization Framework for Multicores
    Fletcher, Christopher W.
    Harding, Rachael
    Khan, Omer
    Devadas, Srinivas
    PROCEEDINGS OF THE 21ST INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES (PACT'12), 2012, : 467 - 468
  • [50] Low-overhead core swapping for thermal management
    Kursun, E
    Reinman, G
    Sair, S
    Shayesteh, A
    Sherwood, T
    POWER-AWARE COMPUTER SYSTEMS, 2005, 3471 : 46 - 60