A Machine Learning Approach Towards Runtime Optimisation of Matrix Multiplication

Cited by: 3
Authors
Xia, Yufan [1]
De La Pierre, Marco [2]
Barnard, Amanda S. [1]
Barca, Giuseppe Maria Junior [1]
Affiliations
[1] Australian Natl Univ, Canberra, Australia
[2] Pawsey Supercomp Res Ctr, Perth, Australia
Source
2023 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM, IPDPS | 2023
Keywords
GEMM; BLAS; Machine learning; BLIS; MKL; Linear Algebra; Multiple threads; ALGEBRA; ALGORITHMS
DOI
10.1109/IPDPS54959.2023.00059
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The GEneral Matrix Multiplication (GEMM) is one of the essential algorithms in scientific computing. Single-thread GEMM implementations are well optimised with techniques like blocking and autotuning. However, due to the complexity of modern multi-core shared-memory systems, it is challenging to determine the number of threads that minimises the multi-thread GEMM runtime. We present a proof-of-concept approach to building an Architecture and Data-Structure Aware Linear Algebra (ADSALA) software library that uses machine learning to optimise the runtime performance of BLAS routines. More specifically, our method uses a machine learning model, trained on collected runtime data, to select on the fly the optimal number of threads for a given GEMM task. Test results on two different HPC node architectures, one based on a two-socket Intel Cascade Lake and the other on a two-socket AMD® Zen 3, revealed a 25 to 40 per cent speedup compared to traditional GEMM implementations in BLAS for GEMM tasks whose memory usage is within 100 MB.
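The thread-selection idea in the abstract can be illustrated with a minimal Python sketch. It is not the ADSALA implementation: the function names, the candidate thread range, and in particular `toy_runtime_model` (a hand-written cost formula standing in for a trained machine learning model) are all assumptions made for illustration. The sketch computes the memory footprint of a GEMM task and picks the thread count with the lowest predicted runtime.

```python
from typing import Callable, Iterable


def gemm_memory_bytes(m: int, k: int, n: int, itemsize: int = 8) -> int:
    """Bytes needed to hold A (m x k), B (k x n) and C (m x n).

    itemsize=8 assumes double precision.
    """
    return (m * k + k * n + m * n) * itemsize


def toy_runtime_model(m: int, k: int, n: int, threads: int) -> float:
    """Hypothetical stand-in for a trained runtime predictor.

    Assumes ideal scaling of the 2*m*k*n floating-point operations
    plus a fixed synchronisation cost per thread. Both constants are
    invented for this example, not measured values.
    """
    flops = 2.0 * m * k * n
    per_thread_rate = 1.0e10  # assumed flop/s per thread
    sync_overhead = 2.0e-5    # assumed seconds of overhead per thread
    return flops / (per_thread_rate * threads) + sync_overhead * threads


def select_threads(
    m: int,
    k: int,
    n: int,
    candidates: Iterable[int],
    predict: Callable[[int, int, int, int], float] = toy_runtime_model,
) -> int:
    """Return the candidate thread count with the lowest predicted runtime."""
    return min(candidates, key=lambda t: predict(m, k, n, t))


if __name__ == "__main__":
    cores = range(1, 49)  # assumed 48-core node
    for dims in [(64, 64, 64), (2000, 2000, 2000)]:
        size_mb = gemm_memory_bytes(*dims) / 1e6
        best = select_threads(*dims, cores)
        print(f"{dims}: {size_mb:.2f} MB, predicted best threads = {best}")
```

Under this toy model, small tasks favour few threads because the per-thread overhead dominates, while large tasks favour all available cores. In practice the `predict` argument would be replaced by a model fitted to measured GEMM timings on the target machine.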
Pages: 524-534
Page count: 11
Related papers
50 in total
  • [21] Characterizing Machine Learning-Based Runtime Prefetcher Selection
    Alcorta, Erika S.
    Madhav, Mahesh
    Afoakwa, Richard
    Tetrick, Scott
    Yadwadkar, Neeraja J.
    Gerstlauer, Andreas
    IEEE COMPUTER ARCHITECTURE LETTERS, 2024, 23 (02) : 146 - 149
  • [22] Towards Practical Fast Matrix Multiplication based on Trilinear Aggregation
    Hadas, Tor
    Schwartz, Oded
    PROCEEDINGS OF THE INTERNATIONAL SYMPOSIUM ON SYMBOLIC & ALGEBRAIC COMPUTATION, ISSAC 2023, 2023 : 289 - 297
  • [23] Machine Learning Predictions for Underestimation of Job Runtime on HPC System
    Guo, Jian
    Nomura, Akihiro
    Barton, Ryan
    Zhang, Haoyu
    Matsuoka, Satoshi
    SUPERCOMPUTING FRONTIERS, SCFA 2018, 2018, 10776 : 179 - 198
  • [24] Towards a new approach to predict business performance using machine learning
    Song, Yue-gang
    Cao, Qi-lin
    Zhang, Chen
    COGNITIVE SYSTEMS RESEARCH, 2018, 52 : 1004 - 1012
  • [25] A machine learning approach towards the differentiation between interoceptive and exteroceptive attention
    Zuo, Zoey X.
    Price, Cynthia J.
    Farb, Norman A. S.
    EUROPEAN JOURNAL OF NEUROSCIENCE, 2023, 58 (02) : 2523 - 2546
  • [26] Towards Trustworthy Machine Learning in Production: An Overview of the Robustness in MLOps Approach
    Bayram, Firas
    Ahmed, Bestoun S.
    ACM COMPUTING SURVEYS, 2025, 57 (05)
  • [27] Machine learning in cardiovascular risk assessment: Towards a precision medicine approach
    Wang, Yifan
    Aivalioti, Evmorfia
    Stamatelopoulos, Kimon
    Zervas, Georgios
    Mortensen, Martin Bodtker
    Zeller, Marianne
    Liberale, Luca
    Di Vece, Davide
    Schweiger, Victor
    Camici, Giovanni G.
    Luescher, Thomas F.
    Kraler, Simon
    EUROPEAN JOURNAL OF CLINICAL INVESTIGATION, 2025, 55
  • [28] Runtime and memory consumption analyses for machine learning R programs
    Kotthaus, Helena
    Korb, Ingo
    Lang, Michel
    Bischl, Bernd
    Rahnenfuehrer, Joerg
    Marwedel, Peter
    JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2015, 85 (01) : 14 - 29
  • [29] Using Machine Learning to Estimate Utilization and Throughput for OpenCL-Based Matrix-Vector Multiplication (MVM)
    Naher, Jannatun
    Gloster, Clay
    Doss, Christopher C.
    Jadhav, Shrikant S.
    2020 10TH ANNUAL COMPUTING AND COMMUNICATION WORKSHOP AND CONFERENCE (CCWC), 2020 : 365 - 372
  • [30] Machine learning techniques for sequential learning engineering design optimisation
    Humphrey, L. R.
    Dubas, A. J.
    Fletcher, L. C.
    Davis, A.
    PLASMA PHYSICS AND CONTROLLED FUSION, 2024, 66 (02)