Fast algorithm for parallel solving inversion of large scale small matrices based on GPU

Cited by: 2
Authors
Jin, Xuebin [1 ,2 ,3 ]
Chen, Yewang [1 ,2 ,3 ]
Fan, Wentao [1 ,2 ,3 ]
Zhang, Yong [4 ]
Du, Jixiang [1 ,2 ]
Affiliations
[1] Huaqiao Univ, Coll Comp Sci & Technol, Jimei Avenue 668, Xiamen 361021, Fujian, Peoples R China
[2] Huaqiao Univ, Fujian Key Lab Big Data Intelligence & Secur, Jimei Avenue 668, Xiamen 361021, Fujian, Peoples R China
[3] Huaqiao Univ, Xiamen Key Lab Comp Vis & Pattern Recognit, Jimei Avenue 668, Xiamen 361021, Fujian, Peoples R China
[4] Huaqiao Univ, Coll Mech Engn & Automat, Jimei Avenue 668, Xiamen 361021, Fujian, Peoples R China
Source
JOURNAL OF SUPERCOMPUTING | 2023, Vol. 79, Issue 16
Funding
National Natural Science Foundation of China
Keywords
GPU acceleration; Matrix inversion; A large number of small matrices; High performance computing; CUDA;
DOI
10.1007/s11227-023-05336-7
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Inverting a matrix is time-consuming, and much prior work focuses on accelerating the inversion of a single large matrix on a GPU. However, parallelizing the inversion of a large number of small matrices has received little attention, even though this problem arises widely in computer science, for example in accelerating cryptographic and image processing algorithms. In this paper, we propose a Revised In-Place Inversion algorithm for inverting a large number of small matrices on the CUDA platform. It adopts a more refined parallelization scheme and outperforms other algorithms, achieving a speedup of up to 20.9572 times over the batched matrix inverse kernel in cuBLAS. We also find that each GPU device has an upper bound on the input data size, beyond which performance degrades. Based on this finding, we propose the Saturation Size Curve, which is used to divide the matrices into batches and improve performance. Experimental results show that this strategy increases the algorithm's performance by 1.75 times and effectively alleviates the performance degradation.
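
To make the two ideas in the abstract concrete, the following is a minimal CPU-side sketch in Python. It is not the paper's Revised In-Place Inversion algorithm or its CUDA kernel: it uses a textbook in-place Gauss-Jordan elimination with partial pivoting, and the names invert_in_place, invert_batched, and batch_size are hypothetical. In particular, batch_size is only a stand-in for a limit that the paper derives from its Saturation Size Curve; here it is an ordinary parameter chosen by the caller.

```python
from typing import List

Matrix = List[List[float]]


def invert_in_place(a: Matrix, eps: float = 1e-12) -> None:
    """Overwrite the square matrix `a` with its inverse.

    Textbook in-place Gauss-Jordan elimination with partial pivoting.
    No second n x n work matrix is allocated; the inverse is built up
    in the storage of `a` itself.
    """
    n = len(a)
    pivots = []
    for k in range(n):
        # Partial pivoting: pick the row with the largest entry in column k.
        p = max(range(k, n), key=lambda r: abs(a[r][k]))
        if abs(a[p][k]) < eps:
            raise ValueError("matrix is singular or nearly singular")
        a[k], a[p] = a[p], a[k]
        pivots.append(p)

        # Scale the pivot row; the pivot slot receives 1 / pivot.
        pivot = a[k][k]
        a[k][k] = 1.0
        for j in range(n):
            a[k][j] /= pivot

        # Eliminate column k from every other row.
        for i in range(n):
            if i != k:
                factor = a[i][k]
                a[i][k] = 0.0
                for j in range(n):
                    a[i][j] -= factor * a[k][j]

    # Undo the row swaps by swapping columns in reverse order.
    for k in reversed(range(n)):
        p = pivots[k]
        if p != k:
            for row in a:
                row[k], row[p] = row[p], row[k]


def invert_batched(matrices: List[Matrix], batch_size: int) -> int:
    """Invert all matrices in place, `batch_size` at a time.

    Returns the number of batches processed. On a GPU, each batch would
    correspond to one kernel launch; here the batches run sequentially.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batches = 0
    for start in range(0, len(matrices), batch_size):
        for m in matrices[start:start + batch_size]:
            invert_in_place(m)
        batches += 1
    return batches
```

Calling invert_batched(mats, 2) on a list of three small matrices processes them in two batches and leaves each matrix replaced by its inverse. The matrices are independent of one another, which is what makes the problem a natural fit for assigning them to separate GPU threads or thread blocks.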
Pages: 18313-18339
Number of pages: 27