Elegante: A Machine Learning-Based Threads Configuration Tool for SpMV Computations on Shared Memory Architecture

Authors
Ahmad, Muhammad [1]
Sardar, Usman [2]
Batyrshin, Ildar [1]
Hasnain, Muhammad [3]
Sajid, Khan [4]
Sidorov, Grigori [1]
Affiliations
[1] Inst Politecn Nacl CIC PN, Ctr Invest Comp, Mexico City 07738, Mexico
[2] Inst Arts & Culture, Sch Informat & Robot, Lahore 54000, Pakistan
[3] Lahore Leads Univ, Dept Comp Sci, Lahore 54000, Pakistan
[4] Zhejiang Normal Univ, Coll Comp Sci & Technol, Jinhua 321004, Peoples R China
Keywords
CSR; machine learning; SVM; high-performance computing; parallel computing; OpenMPI; shared memory;
DOI
10.3390/info15110685
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The sparse matrix-vector product (SpMV) is a fundamental computational kernel used in a wide range of scientific and engineering applications. It is commonly used to solve linear and partial differential equations. Computing the SpMV product in parallel is a challenging task. Existing solutions often assign a fixed number of threads to rows based on empirical formulas, leading to sub-optimal configurations and significant performance losses. Elegante, our proposed machine learning-powered tool, takes a data-driven approach to identifying the optimal thread configuration for SpMV computations on a shared memory architecture. It does so by predicting the best thread configuration from the sparsity pattern of each sparse matrix. Our approach involves training and testing various base and ensemble machine learning algorithms, including decision tree, random forest, gradient boosting, logistic regression, and support vector machine. We experimented with a dataset of roughly 1000 real-world matrices. These matrices originate from 46 distinct application domains, including robotics, power networks, 2D/3D meshing, and computational fluid dynamics. Our proposed methodology achieved 62% of the highest achievable performance and is 7.33 times faster, a substantial gap relative to the default OpenMP configuration policy and to the traditional practice of manually or randomly selecting the number of threads. This work is the first attempt to use the structure of the matrix to predict the optimal thread configuration for optimizing parallel SpMV computation in a shared memory environment.
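The role of the thread count as a tunable parameter can be illustrated with a minimal sketch of a row-partitioned SpMV over a matrix in CSR format (the storage format named in the keywords). This is not the authors' implementation: the function names `spmv_csr` and `spmv_csr_rows`, the static row partitioning, and the small example matrix are all assumptions made for illustration. The abstract refers to OpenMP, whereas this sketch uses Python threads, which show only how rows are divided among workers and would not yield a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def spmv_csr_rows(values, col_idx, row_ptr, x, row_start, row_end):
    """Compute the entries y[row_start:row_end] of y = A @ x for a CSR matrix A."""
    out = []
    for i in range(row_start, row_end):
        acc = 0.0
        # Nonzeros of row i live in values[row_ptr[i]:row_ptr[i + 1]].
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        out.append(acc)
    return out


def spmv_csr(values, col_idx, row_ptr, x, num_threads=1):
    """Row-partitioned SpMV; num_threads is the knob a predictor would choose.

    Hypothetical helper: rows are split into contiguous, equally sized chunks
    (an assumption, loosely analogous to a static schedule).
    """
    n_rows = len(row_ptr) - 1
    if n_rows <= 0:
        return []
    num_threads = max(1, min(num_threads, n_rows))
    chunk = -(-n_rows // num_threads)  # ceiling division
    ranges = [(s, min(s + chunk, n_rows)) for s in range(0, n_rows, chunk)]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        parts = list(
            pool.map(
                lambda r: spmv_csr_rows(values, col_idx, row_ptr, x, r[0], r[1]),
                ranges,
            )
        )
    y = []
    for part in parts:  # pool.map preserves the order of the row ranges
        y.extend(part)
    return y


# Example matrix (illustrative only):
# [[1, 0, 2],
#  [0, 3, 0],
#  [4, 0, 5]]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
x = [1.0, 2.0, 3.0]

y = spmv_csr(values, col_idx, row_ptr, x, num_threads=2)
```

The result is the same for any thread count; only the division of rows among workers changes. A matrix with highly uneven row lengths would leave some workers with far more nonzeros than others, which is why a choice informed by the sparsity pattern can matter.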
Pages: 19