ExTensor: An Accelerator for Sparse Tensor Algebra

Times cited: 160
Authors
Hegde, Kartik [1]
Asghari-Moghaddam, Hadi [1]
Pellauer, Michael [2]
Crago, Neal [2]
Jaleel, Aamer [2]
Solomonik, Edgar [1]
Emer, Joel S. [2, 3]
Fletcher, Christopher W. [1]
Affiliations
[1] Univ Illinois, Champaign, IL 61820 USA
[2] NVIDIA, Santa Clara, CA USA
[3] MIT, Cambridge, MA 02139 USA
Source
MICRO'52: The 52nd Annual IEEE/ACM International Symposium on Microarchitecture | 2019
Keywords
Tensor Algebra; Sparse Computation; Hardware Acceleration
DOI
10.1145/3352460.3358275
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Generalized tensor algebra is a prime candidate for acceleration via customized ASICs. Modern tensors feature a wide range of data sparsity, with the density of non-zero elements ranging from 10^-6% to 50%. This paper proposes a novel approach to accelerate tensor kernels based on the principle of hierarchical elimination of computation in the presence of sparsity. This approach relies on rapidly finding intersections (situations where both operands of a multiplication are non-zero), enabling new data fetching mechanisms and avoiding memory latency overheads associated with sparse kernels implemented in software. We propose the ExTensor accelerator, which builds these novel ideas on handling sparsity into hardware to enable better bandwidth utilization and compute throughput. We evaluate ExTensor on several kernels relative to industry libraries (Intel MKL) and state-of-the-art tensor algebra compilers (TACO). When bandwidth-normalized, we demonstrate an average speedup of 3.4x, 1.3x, 2.8x, 24.9x, and 2.7x on SpMSpM, SpMM, TTV, TTM, and SDDMM kernels, respectively, over a server-class CPU.
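
The intersection idea in the abstract can be illustrated in software. The following is a minimal sketch, not the ExTensor hardware design: it assumes each sparse vector is stored as a sorted list of coordinates plus a parallel list of values, and uses a plain two-pointer scan to find coordinates where both operands are non-zero. The names `intersect_coords` and `sparse_dot` are hypothetical and chosen only for this example.

```python
def intersect_coords(a_coords, b_coords):
    """Return index pairs (i, j) where a_coords[i] == b_coords[j].

    Both inputs are assumed to be sorted lists of non-zero coordinates.
    Only matching coordinates can contribute a non-zero product, so
    everything else is skipped without touching the values.
    """
    i = j = 0
    matches = []
    while i < len(a_coords) and j < len(b_coords):
        if a_coords[i] == b_coords[j]:
            matches.append((i, j))
            i += 1
            j += 1
        elif a_coords[i] < b_coords[j]:
            i += 1  # coordinate present only in a: no work needed
        else:
            j += 1  # coordinate present only in b: no work needed
    return matches


def sparse_dot(a_coords, a_vals, b_coords, b_vals):
    """Dot product that multiplies only at intersecting coordinates."""
    return sum(a_vals[i] * b_vals[j]
               for i, j in intersect_coords(a_coords, b_coords))


# Illustrative data (made up for this sketch).
a_coords, a_vals = [0, 3, 7, 9], [1.0, 2.0, 3.0, 4.0]
b_coords, b_vals = [3, 4, 9], [10.0, 20.0, 30.0]
result = sparse_dot(a_coords, a_vals, b_coords, b_vals)
```

Here only coordinates 3 and 9 appear in both operands, so two multiplications are performed instead of one per coordinate. The abstract's "hierarchical" elimination suggests applying the same test at coarser levels of the tensor as well; this sketch shows only the single-level case.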
Pages: 319-333
Page count: 15