TAUoverSupermon: Low-overhead online parallel performance monitoring

Citations: 0
Authors
Nataraj, Aroon [1]
Sottile, Matthew [2]
Morris, Alan [1]
Malony, Allen D. [1]
Shende, Sameer [1]
Affiliations
[1] Univ Oregon, Dept Comp & Informat Sci, Eugene, OR 97403 USA
[2] Los Alamos Natl Lab, Los Alamos, NM USA
Source
EURO-PAR 2007 PARALLEL PROCESSING, PROCEEDINGS | 2007 / Vol. 4641
Keywords
online performance measurement; cluster monitoring;
DOI
Not available
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract
Online application performance monitoring allows tracking performance characteristics during execution as opposed to doing so post-mortem. This opens up several possibilities otherwise unavailable such as real-time visualization and application performance steering that can be useful in the context of long-running applications. As HPC systems grow in size and complexity, the key challenge is to keep the online performance monitor scalable and low overhead while still providing a useful performance reporting capability. Two fundamental components that constitute such a performance monitor are the measurement and transport systems. We adapt and combine two existing, mature systems - TAU and Supermon - to address this problem. TAU performs the measurement while Supermon is used to collect the distributed measurement state. Our experiments show that this novel approach leads to very low-overhead application monitoring as well as other benefits unavailable from using a transport such as NFS.
Pages: 85 / +
Page count: 3
Related papers
50 in total
  • [31] Composing Low-Overhead Scheduling Strategies for Improving Performance of Scientific Applications
    Kale, Vivek
    Gropp, William D.
    OPENMP: HETEROGENOUS EXECUTION AND DATA MOVEMENTS, IWOMP 2015, 2015, 9342 : 18 - 29
  • [32] LOCAT: Low-Overhead Online Configuration Auto-Tuning of Spark SQL Applications
    Xin, Jinhan
    Hwang, Kai
    Yu, Zhibin
    PROCEEDINGS OF THE 2022 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA (SIGMOD '22), 2022, : 674 - 684
  • [33] Low-Voltage Low-Overhead Asynchronous Logic
    Sridharan, Akshay
    Sechen, Carl
    Jafari, Roozbeh
    2013 IEEE INTERNATIONAL SYMPOSIUM ON LOW POWER ELECTRONICS AND DESIGN (ISLPED), 2013, : 261 - 266
  • [34] Low-Latency Low-Overhead Zipper Codes
    Karimi, Bashirreza
    Barakatain, Masoud
    Hashemi, Yoones
    Chang, Deyuan
    Ebrahimzad, Hamid
    Li, Chuandong
    2022 EUROPEAN CONFERENCE ON OPTICAL COMMUNICATION (ECOC), 2022,
  • [35] A low-overhead checkpointing protocol for mobile networks
    Ahmed, RE
    Khaliq, A
    CCECE 2003: CANADIAN CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING, VOLS 1-3, PROCEEDINGS: TOWARD A CARING AND HUMANE TECHNOLOGY, 2003, : 1779 - 1782
  • [36] Low-Overhead Compressibility Prediction for High-Performance Lossless Data Compression
    Kim, Youngil
    Choi, Seungdo
    Lee, Daeyong
    Jeong, Joonyong
    Kwak, Jaewook
    Lee, Jungkeol
    Lee, Gyeongyong
    Lee, Sangjin
    Park, Kibin
    Jeong, Jinwoo
    Kexin, Wang
    Song, Yong Ho
    IEEE ACCESS, 2020, 8 : 37105 - 37123
  • [37] Performance Impact of Magnetic and Thermal Attack on STTRAM and Low-Overhead Mitigation Techniques
    Jang, Jae-Won
    Ghosh, Swaroop
    ISLPED '16: PROCEEDINGS OF THE 2016 INTERNATIONAL SYMPOSIUM ON LOW POWER ELECTRONICS AND DESIGN, 2016, : 136 - 141
  • [38] A Low-Overhead Method of Embedded Software Profiling
    Liu Fagui
    Li Shengwen
    Xie Ran
    Luo Chunwei
    2009 ISECS INTERNATIONAL COLLOQUIUM ON COMPUTING, COMMUNICATION, CONTROL, AND MANAGEMENT, VOL IV, 2009, : 436 - 439
  • [39] Low-Overhead Defect Tolerance in Crossbar Nanoarchitectures
    Tahoori, Mehdi B.
    ACM JOURNAL ON EMERGING TECHNOLOGIES IN COMPUTING SYSTEMS, 2009, 5 (02)
  • [40] A low-overhead networking mechanism for virtualized high-performance computing systems
    Jang, Jae-Wan
    Seo, Euiseong
    Jo, Heeseung
    Kim, Jin-Soo
    JOURNAL OF SUPERCOMPUTING, 2012, 59 (01): : 443 - 468