Kronecker Attention Networks

Cited by: 8
Authors
Gao, Hongyang [1]
Wang, Zhengyang [1]
Ji, Shuiwang [1]
Affiliations
[1] Texas A&M Univ, College Stn, TX 77843 USA
Source
KDD '20: PROCEEDINGS OF THE 26TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING | 2020
Funding
U.S. National Science Foundation;
Keywords
Attention; neural networks; Kronecker attention; image classification; image segmentation;
DOI
10.1145/3394486.3403065
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Attention operators have been applied to both 1-D data such as text and higher-order data such as images and videos. Using attention operators on high-order data requires flattening the spatial or spatial-temporal dimensions into a vector, which is assumed to follow a multivariate normal distribution. This not only incurs excessive requirements on computational resources, but also fails to preserve structures in the data. In this work, we propose to avoid flattening by assuming the data follow matrix-variate normal distributions. Based on this new view, we develop Kronecker attention operators (KAOs) that operate on high-order tensor data directly. More importantly, the proposed KAOs lead to dramatic reductions in computational resources. Experimental results show that our methods reduce the amount of required computational resources by a factor of hundreds, with larger factors for higher-dimensional and higher-order data. Results also show that networks with KAOs outperform models without attention, while achieving performance competitive with that of networks using the original attention operators.
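The cost of flattening described in the abstract can be illustrated with a small counting sketch. It only counts entries of the attention matrix; it is not the authors' implementation. The function names are hypothetical, and the "reduced" variant assumes, purely for illustration, that every query attends to h + w summary keys instead of all h * w flattened positions, a detail this record does not state.

```python
def flattened_attention_entries(h, w):
    """Attention-matrix entries when an h x w feature map is flattened
    into h * w tokens and every token attends to every other token."""
    n = h * w
    return n * n


def reduced_attention_entries(h, w):
    """Attention-matrix entries when all h * w queries attend to only
    h + w summary keys (an illustrative assumption, not from the source)."""
    return (h * w) * (h + w)


if __name__ == "__main__":
    for side in (16, 32, 64):
        full = flattened_attention_entries(side, side)
        reduced = reduced_attention_entries(side, side)
        # The ratio grows with the spatial size of the feature map.
        print(side, full, reduced, full / reduced)
```

Under this assumption the saving in attention-matrix entries is h * w / (h + w), so it grows with the spatial size of the input, which is in line with the abstract's remark that larger reductions appear for higher-dimensional data.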
Pages: 229-237
Number of pages: 9