A Machine Learning Approach Towards Runtime Optimisation of Matrix Multiplication

Cited by: 3
Authors
Xia, Yufan [1]
De La Pierre, Marco [2]
Barnard, Amanda S. [1]
Barca, Giuseppe Maria Junior [1]
Affiliations
[1] Australian Natl Univ, Canberra, Australia
[2] Pawsey Supercomp Res Ctr, Perth, Australia
Source
2023 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM, IPDPS | 2023
Keywords
GEMM; BLAS; Machine learning; BLIS; MKL; Linear Algebra; Multiple threads; ALGEBRA; ALGORITHMS
DOI
10.1109/IPDPS54959.2023.00059
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The GEneral Matrix Multiplication (GEMM) is one of the essential algorithms in scientific computing. Single-thread GEMM implementations are well-optimised with techniques like blocking and autotuning. However, due to the complexity of modern multi-core shared memory systems, it is challenging to determine the number of threads that minimises the multi-thread GEMM runtime. We present a proof-of-concept approach to building an Architecture and Data-Structure Aware Linear Algebra (ADSALA) software library that uses machine learning to optimise the runtime performance of BLAS routines. More specifically, our method uses a machine learning model on-the-fly to automatically select the optimal number of threads for a given GEMM task based on the collected training data. Test results on two different HPC node architectures, one based on a two-socket Intel Cascade Lake and the other on a two-socket AMD(R) Zen 3, revealed a 25 to 40 per cent speedup compared to traditional GEMM implementations in BLAS for GEMM tasks with a memory footprint within 100 MB.
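The thread-selection idea in the abstract can be illustrated with a minimal sketch. It is not the ADSALA implementation: the abstract does not name the model, so a 1-nearest-neighbour regressor stands in for it here, and the training records, the candidate thread counts, and the names `predict_runtime` and `select_threads` are all hypothetical.

```python
import math

# Hypothetical training data: (m, k, n, threads) -> measured runtime in seconds.
# The numbers are invented for illustration and are not taken from the paper.
TRAINING = [
    ((64, 64, 64, 1), 0.00010),
    ((64, 64, 64, 4), 0.00015),
    ((64, 64, 64, 16), 0.00040),
    ((2048, 2048, 2048, 1), 1.60),
    ((2048, 2048, 2048, 4), 0.45),
    ((2048, 2048, 2048, 16), 0.15),
]


def features(m, k, n, threads):
    """Map a GEMM task and a thread count to log2-scaled features."""
    return tuple(math.log2(v) for v in (m, k, n, threads))


def predict_runtime(m, k, n, threads):
    """Predict runtime with 1-nearest-neighbour regression (an assumed model)."""
    query = features(m, k, n, threads)
    _, runtime = min(
        TRAINING, key=lambda rec: math.dist(features(*rec[0]), query)
    )
    return runtime


def select_threads(m, k, n, candidates=(1, 4, 16)):
    """Return the candidate thread count with the lowest predicted runtime."""
    return min(candidates, key=lambda t: predict_runtime(m, k, n, t))


if __name__ == "__main__":
    for dims in [(64, 64, 64), (2048, 2048, 2048)]:
        print(dims, "->", select_threads(*dims), "threads")
```

In a real library the selected value would then be passed to the BLAS backend's own thread control before calling GEMM; that step is omitted because it depends on the backend.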
Pages: 524-534
Number of pages: 11
Related papers
50 in total
  • [31] Using Machine Learning to Estimate Utilization and Throughput for OpenCL-Based Matrix-Vector Multiplication (MVM)
    Naher, Jannatun
    Gloster, Clay
    Doss, Christopher C.
    Jadhav, Shrikant S.
    2020 10TH ANNUAL COMPUTING AND COMMUNICATION WORKSHOP AND CONFERENCE (CCWC), 2020, : 365 - 372
  • [32] A machine learning approach to synchronization of automata
    Podolak, Igor
    Roman, Adam
    Szykula, Marek
    Zielinski, Bartosz
    EXPERT SYSTEMS WITH APPLICATIONS, 2018, 97 : 357 - 371
  • [33] Towards Optimal Multiple Constant Multiplication: A Hypergraph Approach
    Gustafsson, Oscar
    2008 42ND ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS AND COMPUTERS, VOLS 1-4, 2008, : 1805 - 1809
  • [34] A machine learning-oriented pseudo-field approach to accelerate runtime of molecular dynamics simulation of liquids
    Khan, Md. Akib
    Morshed, A. K. M. Monjur
    Paul, Titan C.
    MOLECULAR SIMULATION, 2023, 49 (15) : 1442 - 1451
  • [35] IMPLEMENTATION OF HYPERPARAMETER OPTIMISATION AND OVER-SAMPLING IN DETECTING CYBERBULLYING USING MACHINE LEARNING APPROACH
    Ali, Wan Noor Hamiza Wan
    Mohd, Masnizah
    Fauzi, Fariza
    Shirai, Kiyoaki
    Noor, Muhammad Junaidi Mahamad
    MALAYSIAN JOURNAL OF COMPUTER SCIENCE, 2021, : 78 - 100
  • [36] A data-driven machine learning approach for the 3D printing process optimisation
    Nguyen, Phuong Dong
    Nguyen, Thanh Q.
    Tao, Q. B.
    Vogel, Frank
    Nguyen-Xuan, H.
    VIRTUAL AND PHYSICAL PROTOTYPING, 2022, 17 (04) : 768 - 786
  • [37] A Reinforcement Learning Approach to Powertrain Optimisation
    Matallah, Hocine
    Javied, Asad
    Williams, Alexander
    Abdo, Ashraf Fahmy
    Belblidia, Fawzi
    SUSTAINABLE DESIGN AND MANUFACTURING, SDM 2022, 2023, 338 : 252 - 261
  • [38] Machine learning accelerates high throughput design and screening of MOF mixed-matrix membranes towards He separation
    Wu, Jiasheng
    Guo, Yanan
    Liu, Guozhen
    Liu, Gongping
    Jin, Wanqin
    JOURNAL OF MEMBRANE SCIENCE, 2025, 717
  • [39] Analysis of the hyperparameter optimisation of four machine learning satellite imagery classification methods
    Alonso-Sarria, Francisco
    Valdivieso-Ros, Carmen
    Gomariz-Castillo, Francisco
    COMPUTATIONAL GEOSCIENCES, 2024, 28 (03) : 551 - 571
  • [40] Adaptive OpenMP Task Scheduling Using Runtime APIs and Machine Learning
    Qawasmeh, Ahmad R.
    Malik, Abid M.
    Chapman, Barbara M.
    2015 IEEE 14TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA), 2015, : 889 - 895