Performance Improvement of Stencil Computations for Multi-core Architectures based on Machine Learning

Cited by: 7
Authors
Martinez, Victor [1]
Dupros, Fabrice [2]
Castro, Marcio [3]
Navaux, Philippe [1]
Affiliations
[1] Fed Univ Rio Grande do Sul UFRGS, Informat Inst INF, Porto Alegre, RS, Brazil
[2] Bur Rech Geol & Minieres, Orleans, France
[3] Fed Univ Santa Catarina UFSC, Dept Informat & Stat INE, Florianopolis, SC, Brazil
Source
INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE (ICCS 2017) | 2017 / Vol. 108
Keywords
machine learning; stencil computation; multi-core; performance model;
DOI
10.1016/j.procs.2017.05.164
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Stencil computations are the basis to solve many problems related to Partial Differential Equations (PDEs). Obtaining the best performance with such numerical kernels is a major issue as many critical parameters (architectural features, compiler flags, memory policies, multithreading strategies) must be finely tuned. In this context, auto-tuning methods have been extensively used to improve the overall performance. However, the complexity of current architectures and the large number of optimizations to consider reduce the efficiency of this approach. This paper focuses on the use of Machine Learning to predict the performance of stencil kernels on multi-core architectures. Low-level hardware counters (e.g. cache-misses and TLB misses) on a limited number of executions are used to build our predictive model. We have considered two different kernels (7-point Jacobi and seismic wave modelling) to demonstrate the effectiveness of our approach. Our results show that performance can be predicted and that the best input configuration for stencil problems can be obtained by simulations of hardware counters and performance measurements. © 2017 The Authors. Published by Elsevier B.V. Peer-review under responsibility of the scientific committee of the International Conference on Computational Science.
Pages: 305-314
Number of pages: 10
Related papers
50 in total
  • [1] Scaling and Analyzing the Stencil Performance on Multi-Core and Many-Core Architectures
    Gan, Lin
    Fu, Haohuan
    Xue, Wei
    Xu, Yangtong
    Yang, Chao
    Wang, Xinliang
    Lv, Zihong
    You, Yang
    Yang, Guangwen
    Ou, Kaijian
    2014 20TH IEEE INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS (ICPADS), 2014, : 103 - 110
  • [2] Machine Learning Based Performance Prediction for Multi-core Simulation
    Rai, Jitendra Kumar
    Negi, Atul
    Wankar, Rajeev
    MULTI-DISCIPLINARY TRENDS IN ARTIFICIAL INTELLIGENCE, 2011, 7080 : 236 - +
  • [3] Data Compression and Re-computation Based Performance Improvement in Multi-Core Architectures
    Koc, Hakduran
    Garlapati, Mounika
    Madupu, Pranitha P.
    2020 10TH ANNUAL COMPUTING AND COMMUNICATION WORKSHOP AND CONFERENCE (CCWC), 2020, : 390 - 395
  • [4] A Survey of Approaches used in Parallel Architectures and Multi-core Processors, For Performance Improvement
    Shukla, Surendra Kumar
    Murthy, C. N. S.
    Chande, P. K.
    PROGRESS IN SYSTEMS ENGINEERING, 2015, 366 : 537 - 545
  • [5] Optimization and Performance Modeling of Stencil Computations on ARM Architectures
    Zhang, Kaifang
    Su, Huayou
    Zhang, Peng
    Dou, Yong
    Proceedings - 2020 IEEE 22nd International Conference on High Performance Computing and Communications, IEEE 18th International Conference on Smart City and IEEE 6th International Conference on Data Science and Systems, HPCC-SmartCity-DSS 2020, 2020, : 113 - 121
  • [6] High Performance Global Illumination on Multi-core Architectures
    Padron, Emilio J.
    Amor, Margarita
    Doallo, Ramon
    Boo, Montserrat
    PROCEEDINGS OF THE PARALLEL, DISTRIBUTED AND NETWORK-BASED PROCESSING, 2009, : 93 - +
  • [7] Performance issues in emerging homogeneous multi-core architectures
    Kayi, Abdullah
    El-Ghazawi, Tarek
    Newby, Gregory B.
    SIMULATION MODELLING PRACTICE AND THEORY, 2009, 17 (09) : 1485 - 1499
  • [8] Interconnection Network Performance of Multi-core Cluster Architectures
    Hamid, Norhazlina
    Walters, Robert
    Wills, Gary
    2015 2ND INTERNATIONAL CONFERENCE ON COMPUTER, COMMUNICATIONS, AND CONTROL TECHNOLOGY (I4CT), 2015,
  • [9] Understanding the Impact of Cache Performance on Multi-core Architectures
    Ramasubramaniam, N.
    Srinivas, V. V.
    Kumar, P. Pavan
    INFORMATION TECHNOLOGY AND MOBILE COMMUNICATION, 2011, 147 : 403 - 406
  • [10] Optimizing Stencil Computation on Multi-core DSPs
    Zhu, Fugeng
    Fang, Jianbin
    Yu, Kainan
    Qi, Xinxin
    Tang, Tao
    Xie, Jing
    Ren, Jie
    Zhang, Peng
    Che, Yonggang
    Huang, Chun
    53RD INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, ICPP 2024, 2024, : 679 - 690