Performance Improvement of Stencil Computations for Multi-core Architectures based on Machine Learning

Cited by: 7
Authors
Martinez, Victor [1]
Dupros, Fabrice [2]
Castro, Marcio [3]
Navaux, Philippe [1]
Affiliations
[1] Fed Univ Rio Grande do Sul UFRGS, Informat Inst INF, Porto Alegre, RS, Brazil
[2] Bur Rech Geol & Minieres, Orleans, France
[3] Fed Univ Santa Catarina UFSC, Dept Informat & Stat INE, Florianopolis, SC, Brazil
Source
INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE (ICCS 2017) | 2017 / Vol. 108
Keywords
machine learning; stencil computation; multi-core; performance model;
DOI
10.1016/j.procs.2017.05.164
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Stencil computations are the basis to solve many problems related to Partial Differential Equations (PDEs). Obtaining the best performance with such numerical kernels is a major issue as many critical parameters (architectural features, compiler flags, memory policies, multithreading strategies) must be finely tuned. In this context, auto-tuning methods have been extensively used to improve the overall performance. However, the complexity of current architectures and the large number of optimizations to consider reduce the efficiency of this approach. This paper focuses on the use of Machine Learning to predict the performance of stencil kernels on multi-core architectures. Low-level hardware counters (e.g. cache-misses and TLB misses) on a limited number of executions are used to build our predictive model. We have considered two different kernels (7-point Jacobi and seismic wave modelling) to demonstrate the effectiveness of our approach. Our results show that performance can be predicted and that the best input configuration for stencil problems can be obtained by simulations of hardware counters and performance measurements. © 2017 The Authors. Published by Elsevier B.V. Peer-review under responsibility of the scientific committee of the International Conference on Computational Science.
Pages: 305-314
Page count: 10
Related papers
50 in total
  • [21] Scheduling Techniques for Multi-Core Architectures
    Hatanaka, Akira
    Bagherzadeh, Nader
    PROCEEDINGS OF THE 2009 SIXTH INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY: NEW GENERATIONS, VOLS 1-3, 2009, : 865 - 870
  • [22] Data Marshaling for Multi-core Architectures
    Suleman, M. Aater
    Mutlu, Onur
    Joao, Jose A.
    Khubaib
    Patt, Yale N.
    ISCA 2010: THE 37TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE, 2010, : 441 - 450
  • [23] Machine Learning based Electromigration-aware Scheduler for Multi-core Processors
    Kumar, P. Jagadeesh
    Mini, M. G.
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2022, 13 (09) : 571 - 580
  • [24] Circuit Partitioning for Multi-Core Quantum Architectures with Deep Reinforcement Learning
    Pastor, Arnau
    Escofet, Pau
    Ben Rached, Sahar
    Alarcon, Eduard
    Barlet-Ros, Pere
    Abadal, Sergi
    2024 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, ISCAS 2024, 2024,
  • [25] Multifrontal Computations on GPUs and Their Multi-core Hosts
    Lucas, Robert F.
    Wagenbreth, Gene
    Davis, Dan M.
    Grimes, Roger
    HIGH PERFORMANCE COMPUTING FOR COMPUTATIONAL SCIENCE - VECPAR 2010, 2011, 6449 : 71 - +
  • [26] Understanding the Impact of the Interconnection Network Performance of Multi-core Cluster Architectures
    Hamid, Norhazlina
    Walters, Robert
    Wills, Gary
    JOURNAL OF COMPUTERS, 2016, 11 (02) : 132 - 139
  • [27] Performance optimization of the MGB hydrological model for multi-core and GPU architectures
    Freitas, Henrique R. A.
    Mendes, Celso L.
    Ilic, Aleksandar
    ENVIRONMENTAL MODELLING & SOFTWARE, 2022, 148
  • [28] Performance evaluation of evolutionary multi-core and aggressively multi-threaded processor architectures
    Tirumalai, Partha
    Song, Yonghong
    Kalogeropulos, Spiros
    ADVANCES IN COMPUTER SYSTEMS ARCHITECTURE, PROCEEDINGS, 2007, 4697 : 280 - +
  • [29] Branch Prediction Migration for Multi-core Architectures
    Zhang, Tan
    Zhou, Chaobing
    Huang, Libo
    Xiao, Nong
    2017 INTERNATIONAL CONFERENCE ON NETWORKING, ARCHITECTURE, AND STORAGE (NAS), 2017, : 282 - 283
  • [30] Cholesky factorization on SIMD multi-core architectures
    Lemaitre, Florian
    Couturier, Benjamin
    Lacassagne, Lionel
    JOURNAL OF SYSTEMS ARCHITECTURE, 2017, 79 : 1 - 15