An Extended Compression Format for the Optimization of Sparse Matrix-Vector Multiplication

被引:27
作者
Karakasis, Vasileios [1 ]
Gkountouvas, Theodoros [1 ]
Kourtis, Kornilios [2 ]
Goumas, Georgios [1 ]
Koziris, Nectarios [1 ]
机构
[1] Natl Tech Univ Athens NTUA, Comp Syst Lab, Sch Elect & Comp Engn, Zografos 15780, Greece
[2] ETH, Syst Grp Adm, CH-8092 Zurich, Switzerland
关键词
Sparse Matrix-Vector Multiplication; multicore optimizations; data compression;
D O I
10.1109/TPDS.2012.290
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
Sparse matrix-vector multiplication (SpM x V) has been characterized as one of the most significant computational scientific kernels. The key algorithmic characteristic of the SpM x V kernel, that inhibits it from achieving high performance, is its very low flop: byte ratio. In this paper, we present a compressed storage format, called Compressed Sparse eXtended (CSX), that is able to detect and encode simultaneously multiple commonly encountered substructures inside a sparse matrix. Relying on aggressive compression techniques of the sparse matrix's indexing structure, CSX is able to considerably reduce the memory footprint of a sparse matrix, alleviating the pressure to the memory subsystem. In a diverse set of sparse matrices, CSX was able to provide a more than 40 percent average performance improvement over the standard CSR format in SMP architectures and surpassed 20 percent improvement in NUMA systems, significantly outperforming other CSR alternatives. Additionally, it was able to adapt successfully to the nonzero element structure of the considered matrices, exhibiting very stable performance. Finally, in the context of a "real-life" multiphysics simulation software, CSX accelerated the SpM x V component nearly 40 percent and the total solver time approximately 15 percent.
引用
收藏
页码:1930 / 1940
页数:11
相关论文
共 28 条
  • [1] AGARWAL RC, 1992, SUPERCOMPUTING 92 : PROCEEDINGS, P32
  • [2] [Anonymous], 1992, Numerical Methods for Large Eigenvalue Problems
  • [3] [Anonymous], 1994, SPARSKIT BASIC TOOL
  • [4] [Anonymous], P ACM IEEE C SUP
  • [5] Asanovic K., 2006, The landscape of parallel computing research: A view from berkeley
  • [6] Pattern-based Sparse Matrix Representation for Memory-Efficient SMVM Kernels
    Belgin, Mehmet
    Back, Godmar
    Ribbens, Calvin J.
    [J]. ICS'09: PROCEEDINGS OF THE 2009 ACM SIGARCH INTERNATIONAL CONFERENCE ON SUPERCOMPUTING, 2009, : 100 - 109
  • [7] Buluc A., 2011, Proceedings of the 25th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2011), P721, DOI 10.1109/IPDPS.2011.73
  • [8] Buluç A, 2009, SPAA'09: PROCEEDINGS OF THE TWENTY-FIRST ANNUAL SYMPOSIUM ON PARALLELISM IN ALGORITHMS AND ARCHITECTURES, P233
  • [9] The University of Florida Sparse Matrix Collection
    Davis, Timothy A.
    Hu, Yifan
    [J]. ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE, 2011, 38 (01):
  • [10] Eun-Jin Im, 2001, Computational Science - ICCS 2001. International Conference. Proceedings, Part I (Lecture Notes in Computer Science Vol.2073), P127