Analysis and Design Techniques towards High-Performance and Energy-Efficient Dense Linear Solvers on GPUs

Cited by: 10
Authors
Abdelfattah, Ahmad [1]
Haidar, Azzam [1]
Tomov, Stanimire [1]
Dongarra, Jack [1]
Affiliations
[1] Univ Tennessee, Innovat Comp Lab, Knoxville, TN 37996 USA
Keywords
Dense linear solvers; GPU computing; energy efficiency; MATRIX
DOI
10.1109/TPDS.2018.2842785
Chinese Library Classification (CLC) number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
Graphics Processing Units (GPUs) are widely used to accelerate dense linear solvers. The matrix factorizations, which dominate the runtime of these solvers, are often designed using a hybrid scheme in which GPUs perform the trailing matrix updates while CPUs perform the panel factorizations. Consequently, hybrid solutions require high-end CPUs and optimized CPU software in order to deliver high performance. Furthermore, they lack the energy efficiency inherent to GPUs, owing to the use of less energy-efficient CPUs as well as CPU-GPU communication. This paper presents analysis and design techniques that overcome the shortcomings of the hybrid algorithms and allow the design of high-performance and energy-efficient dense LU and Cholesky factorizations that use GPUs only. The full GPU solution eliminates the need for a high-end CPU and optimized CPU software, which leads to better energy efficiency. We discuss different design choices and introduce optimized GPU kernels for panel factorizations. The developed solutions achieve 90+ percent of the performance of optimized hybrid solutions, while improving the energy efficiency by 50 percent. They outperform the vendor library by 30-50 percent in single precision and by 15-50 percent in double precision. We also show that hybrid designs trail the proposed solutions in performance when optimized CPU software is not available.
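The split between panel factorization and trailing matrix update mentioned in the abstract can be illustrated with a minimal sketch of a blocked, right-looking Cholesky factorization. This is not the paper's GPU implementation: it is plain Python on nested lists, and the function name `cholesky_blocked`, the block-size parameter `nb`, and the choice of the right-looking variant are assumptions made for illustration only.

```python
import math


def cholesky_blocked(a, nb=2):
    """Illustrative blocked right-looking Cholesky (hypothetical helper).

    Takes a symmetric positive definite matrix as a list of rows and
    returns the lower-triangular factor L with A = L * L^T.
    """
    n = len(a)
    a = [list(map(float, row)) for row in a]  # work on a copy

    for k in range(0, n, nb):
        e = min(k + nb, n)  # panel covers columns k .. e-1

        # Panel factorization: unblocked Cholesky on columns k .. e-1.
        # In a hybrid scheme this is the part usually left to the CPU.
        for j in range(k, e):
            a[j][j] = math.sqrt(a[j][j])
            for i in range(j + 1, n):
                a[i][j] /= a[j][j]
            # Apply column j to the remaining columns of this panel only.
            for c in range(j + 1, e):
                for i in range(c, n):
                    a[i][c] -= a[i][j] * a[c][j]

        # Trailing matrix update: rank-(e - k) update of columns e .. n-1.
        # In a hybrid scheme this is the part usually done on the GPU.
        for c in range(e, n):
            for i in range(c, n):
                a[i][c] -= sum(a[i][p] * a[c][p] for p in range(k, e))

    # Zero the strictly upper triangle so that only L remains.
    for i in range(n):
        for j in range(i + 1, n):
            a[i][j] = 0.0
    return a
```

Calling `cholesky_blocked([[4, 2, 2], [2, 5, 3], [2, 3, 6]], nb=2)` factors a small 3 x 3 matrix with one full panel and one partial panel. The panel step is sequential and works on a narrow block of columns, while the trailing update is a large matrix-multiply-like operation, which is why the two phases are commonly assigned to different processors.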
Pages: 2700 - 2712
Number of pages: 13
Related papers
50 in total
  • [1] Design techniques for high-performance, energy-efficient control logic
    Ko, U
    Hill, A
    Balsara, PT
    1996 INTERNATIONAL SYMPOSIUM ON LOW POWER ELECTRONICS AND DESIGN - DIGEST OF TECHNICAL PAPERS, 1996, : 97 - 100
  • [2] Design of High-performance while Energy-efficient Microprocessor with Novel Asynchronous Techniques
    Tang, Xiqin
    Shang, Delong
    2024 IEEE 35TH INTERNATIONAL CONFERENCE ON APPLICATION-SPECIFIC SYSTEMS, ARCHITECTURES AND PROCESSORS, ASAP 2024, 2024, : 247 - 248
  • [3] Novel CNFET ternary circuit techniques for high-performance and energy-efficient design
    Tabrizchi, Sepehr
    Taheri, MohammadReza
    Navi, Keivan
    Bagherzadeh, Nader
    IET CIRCUITS DEVICES & SYSTEMS, 2019, 13 (02) : 193 - 202
  • [4] Energy-Efficient Design Methodologies: High-Performance VLSI Adders
    Zeydel, Bart R.
    Baran, Dursun
    Oklobdzija, Vojin G.
    IEEE JOURNAL OF SOLID-STATE CIRCUITS, 2010, 45 (06) : 1220 - 1233
  • [5] Mixing LU and QR factorization algorithms to design high-performance dense linear algebra solvers
    Faverge, Mathieu
    Herrmann, Julien
    Langou, Julien
    Lowery, Bradley
    Robert, Yves
    Dongarra, Jack
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2015, 85 : 32 - 46
  • [6] High-performance, energy-efficient IGBTs
    Snyder, Lucy A.
    Electron Prod Garden City NY, 2008, 8
  • [7] The Design of Fast and Energy-Efficient Linear Solvers: On the Potential of Half-Precision Arithmetic and Iterative Refinement Techniques
    Haidar, Azzam
    Abdelfattah, Ahmad
    Zounon, Mawussi
    Wu, Panruo
    Pranesh, Srikara
    Tomov, Stanimire
    Dongarra, Jack
    COMPUTATIONAL SCIENCE - ICCS 2018, PT I, 2018, 10860 : 586 - 600
  • [8] Thread Batching for High-performance Energy-efficient GPU Memory Design
    Li, Bing
    Mao, Mengjie
    Liu, Xiaoxiao
    Liu, Tao
    Liu, Zihao
    Wen, Wujie
    Chen, Yiran
    Li, Hai
    ACM JOURNAL ON EMERGING TECHNOLOGIES IN COMPUTING SYSTEMS, 2019, 15 (04)
  • [9] HIGH-PERFORMANCE SOLVERS FOR DENSE HERMITIAN EIGENPROBLEMS
    Petschow, M.
    Peise, E.
    Bientinesi, P.
    SIAM JOURNAL ON SCIENTIFIC COMPUTING, 2013, 35 (01) : C1 - C22
  • [10] Energy-Efficient and High-Performance Data Converters
    Goes, Joao
    2024 31ST INTERNATIONAL CONFERENCE ON MIXED DESIGN OF INTEGRATED CIRCUITS AND SYSTEM, MIXDES 2024, 2024, : 15 - 15