Compiler-Directed Performance Model Construction for Parallel Programs

Cited by: 0
Authors
Schindewolf, Martin [1 ]
Kramer, David [1 ]
Cintra, Marcelo [2 ]
Affiliations
[1] KIT, Inst Comp Sci & Engn, Haid Und Neu Str 7, D-76131 Karlsruhe, Germany
[2] Univ Edinburgh, Sch Informat, 10 Crichton St Edinburgh EH8 9AB, Edinburgh, Midlothian, Scotland
Source
ARCHITECTURE OF COMPUTING SYSTEMS - ARCS 2010, PROCEEDINGS | 2010 / Vol. 5974
Keywords
PREDICTION;
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Over the last decade, performance prediction for industrial and scientific workloads on massively parallel high-performance computing systems has been, and remains, an active research area. As applications grow more complex, deriving an analytical performance model from current workloads becomes increasingly challenging: automatically generated models often predict performance inaccurately, while manually constructed analytical models predict better but are very labor-intensive to build. Our approach aims to close the gap between compiler-supported automatic model construction and manual analytical modeling of workloads. Performance-counter values are commonly used to validate a model, so that prediction errors can be determined and quantified. Instead of manually instrumenting the executable to access performance counters, we modified the GCC compiler to insert calls to run-time system functions. Added compiler options let the user control the instrumentation process. The instrumentation then focuses on frequently executed code parts. As in established frameworks, a run-time system tracks the application behavior: traces generated at run time enable the construction of architecture-independent models (using quadratic programming) and, thus, the prediction of larger workloads. In this paper, we introduce our framework and demonstrate its applicability to benchmarks as well as real-world numerical workloads. The experiments reveal an average error rate of 9% for the prediction of larger workloads.
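The abstract does not spell out the model formulation, so the following is only an illustrative sketch of the general idea of fitting a model to small-workload measurements and extrapolating to a larger one. It assumes a simple linear model time(n) = c0 + c1 * n fitted by ordinary least squares, which is the simplest (unconstrained) quadratic program; it is not the authors' formulation. All function names and all numbers are hypothetical.

```python
# Hypothetical sketch, not the paper's method: fit time(n) ~ c0 + c1 * n to
# measurements from small workloads by ordinary least squares (an
# unconstrained quadratic program), then extrapolate to a larger workload.
# Every measurement below is synthetic and chosen only for illustration.

def fit_linear_model(sizes, times):
    """Return (c0, c1) minimizing sum((c0 + c1*n - t)**2) in closed form."""
    m = len(sizes)
    mean_n = sum(sizes) / m
    mean_t = sum(times) / m
    var_n = sum((n - mean_n) ** 2 for n in sizes)
    if var_n == 0:
        raise ValueError("need at least two distinct workload sizes")
    cov = sum((n - mean_n) * (t - mean_t) for n, t in zip(sizes, times))
    c1 = cov / var_n
    c0 = mean_t - c1 * mean_n
    return c0, c1


def predict(model, n):
    """Evaluate the fitted model at workload size n."""
    c0, c1 = model
    return c0 + c1 * n


def relative_error(predicted, measured):
    """Prediction error relative to the measured value."""
    return abs(predicted - measured) / measured


# Synthetic "trace" data: workload sizes and run times in seconds.
train_sizes = [100, 200, 400]
train_times = [1.2, 2.2, 4.2]

model = fit_linear_model(train_sizes, train_times)
predicted = predict(model, 1600)   # extrapolate to a larger workload
measured = 17.0                    # synthetic measurement of that workload
error = relative_error(predicted, measured)
print(f"predicted={predicted:.2f}s measured={measured:.2f}s error={error:.1%}")
```

A relative error computed this way corresponds to the kind of prediction error the abstract reports, although the paper validates against performance-counter values rather than the made-up timings used here.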
Pages: 187 / +
Number of pages: 2