The BiConjugate gradient method on GPUs

Cited by: 12
Authors
Ortega, G. [1]
Garzon, E. M. [1]
Vazquez, F. [1]
Garcia, I. [2]
Affiliations
[1] Univ Almeria, Dpt Comput Archit & Electron, Almeria 04120, Spain
[2] Univ Malaga, Dpt Comput Archit, E-29071 Malaga, Spain
Keywords
BiConjugate gradient method; GPU computing; Parallel computing; Linear system of equations
DOI
10.1007/s11227-012-0761-2
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In a wide variety of applications across scientific and engineering fields, complex and/or nonsymmetric linear systems of equations must be solved. The BiConjugate Gradient method (BCG) is especially relevant for this kind of linear system. Nevertheless, BCG has an enormous computational cost. GPU computing is useful for accelerating this kind of algorithm, but suitable implementations must be developed to exploit the GPU architecture optimally. In this paper, we show how BCG can be effectively accelerated when all operations are computed on a GPU. To this end, BCG has been implemented with two alternative routines for the sparse matrix-vector product (SpMV): the CUSPARSE library and the ELLR-T routine. Although our interest is focused on complex matrices, our implementation has been evaluated on a GPU for two sets of test matrices, complex and real, in single and double precision. Experimental results show that BCG based on the ELLR-T routine achieves the best performance, particularly for the set of complex test matrices. Consequently, this method can be a useful tool for efficiently solving the large (complex and/or nonsymmetric) linear systems of equations that arise in a broad range of applications.
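As an illustration of the two ingredients the abstract names, the following is a minimal CPU sketch in plain Python of an unpreconditioned BiConjugate Gradient iteration driven by a sparse matrix-vector product over ELLPACK-R style storage (padded value and column arrays plus a per-row length array, the layout that ELLR-T builds on). It is not the authors' GPU implementation: the function names `to_ellr`, `spmv`, `spmv_h` and `bicg`, the zero initial guess, the choice of shadow residual, and the stopping rule are all assumptions made for this example, and no breakdown safeguards are included.

```python
import math

def to_ellr(dense):
    """Pack a dense square matrix (list of rows) into ELLPACK-R style arrays."""
    rows = [[(j, v) for j, v in enumerate(row) if v != 0] for row in dense]
    row_len = [len(r) for r in rows]              # nonzeros per row
    width = max(row_len)                          # padded row width
    vals = [[v for _, v in r] + [0] * (width - len(r)) for r in rows]
    cols = [[j for j, _ in r] + [0] * (width - len(r)) for r in rows]
    return vals, cols, row_len

def spmv(ell, x):
    """y = A x; each row only visits its row_len[i] stored entries."""
    vals, cols, row_len = ell
    return [sum(vals[i][k] * x[cols[i][k]] for k in range(row_len[i]))
            for i in range(len(row_len))]

def spmv_h(ell, x):
    """y = A^H x (conjugate transpose), computed by scattering each row."""
    vals, cols, row_len = ell
    y = [0j] * len(row_len)
    for i in range(len(row_len)):
        for k in range(row_len[i]):
            y[cols[i][k]] += vals[i][k].conjugate() * x[i]
    return y

def dot(a, b):
    """Complex inner product <a, b> = sum(conj(a_i) * b_i)."""
    return sum(u.conjugate() * v for u, v in zip(a, b))

def bicg(ell, b, tol=1e-10, max_iter=200):
    """Unpreconditioned BiCG for A x = b, starting from x = 0 (assumed)."""
    n = len(b)
    x = [0j] * n
    r = [complex(v) for v in b]       # residual b - A x with x = 0
    rt = list(r)                      # shadow residual (assumed equal to r)
    p, pt = list(r), list(rt)         # search directions
    rho = dot(rt, r)
    b_norm = math.sqrt(dot(b, b).real)
    for _ in range(max_iter):
        if math.sqrt(dot(r, r).real) <= tol * b_norm:
            break
        q = spmv(ell, p)              # one product with A ...
        qt = spmv_h(ell, pt)          # ... and one with A^H per iteration
        alpha = rho / dot(pt, q)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        rt = [ri - alpha.conjugate() * qi for ri, qi in zip(rt, qt)]
        rho_new = dot(rt, r)
        beta = rho_new / rho
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        pt = [ri + beta.conjugate() * pi for ri, pi in zip(rt, pt)]
        rho = rho_new
    return x

# Usage on a small complex, nonsymmetric system (illustrative data only).
A = [[4 + 1j, 1, 0], [2, 5 - 1j, 1j], [0, 1, 3 + 2j]]
ell = to_ellr(A)
b = spmv(ell, [1 + 1j, 2, -1j])
x = bicg(ell, b)
```

Each iteration costs two sparse products, one with the matrix and one with its conjugate transpose, plus a handful of vector updates and inner products, which is why the choice of SpMV routine dominates the overall run time.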
Pages: 49-58
Number of pages: 10