A Lightweight Transformer with Convolutional Attention

Cited by: 0
Authors
Zeng, Kungan [1 ]
Paik, Incheon [1 ]
Affiliations
[1] Univ Aizu, Sch Comp Sci & Engn, Fukushima, Japan
Source
2020 11TH INTERNATIONAL CONFERENCE ON AWARENESS SCIENCE AND TECHNOLOGY (ICAST) | 2020
Keywords
neural machine translation; Transformer; CNN; Multi-head attention;
DOI
10.1109/ICAST51195.2020.9319489
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline Codes
0808 ; 0809 ;
Abstract
Neural machine translation (NMT) has developed rapidly thanks to the application of various deep learning techniques. In particular, how to construct more effective NMT architectures has attracted increasing attention. The Transformer is a state-of-the-art NMT architecture: it relies entirely on a self-attention mechanism instead of recurrent neural networks (RNNs). Multi-head attention is the crucial component that implements self-attention, and it also dramatically affects the scale of the model. In this paper, we present a new multi-head attention that incorporates a convolution operation. Compared with the base Transformer, our approach effectively reduces the number of parameters. We also conduct a controlled experiment; the results show that the new model performs comparably to the base model.
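The abstract's parameter-reduction claim can be made concrete with a back-of-the-envelope count. The sketch below compares the weight count of standard multi-head attention (four dense d_model × d_model projections, biases omitted) against one illustrative convolutional variant in which the three dense Q/K/V projections are replaced by a single depthwise convolution over the sequence. Note this substitution is an assumption for illustration only; the paper's exact construction of its convolutional attention is not given in this record.

```python
def mha_params(d_model: int) -> int:
    # Standard multi-head attention: Q, K, V, and output projections,
    # each a d_model x d_model weight matrix (biases omitted).
    # The head count does not change the total, since the per-head
    # projections concatenate back to d_model.
    return 4 * d_model * d_model

def conv_mha_params(d_model: int, kernel_size: int = 3) -> int:
    # Hypothetical convolutional variant (an assumption, not the
    # paper's design): replace the three dense Q/K/V projections with
    # one depthwise convolution over the sequence dimension
    # (d_model * kernel_size weights), keeping the output projection
    # dense (d_model * d_model weights).
    return d_model * kernel_size + d_model * d_model

# With d_model = 512 (the base Transformer setting), the dense
# projections dominate, so any convolutional replacement of Q/K/V
# cuts the attention sublayer's parameters by roughly 3/4.
base = mha_params(512)          # 4 * 512 * 512 = 1,048,576
conv = conv_mha_params(512)     # 512 * 3 + 512 * 512 = 263,680
```

Counts like these only cover the attention sublayers; the feed-forward blocks and embeddings are unchanged, so the whole-model savings are smaller than the per-sublayer ratio.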
Pages: 6
Related Papers
50 results
  • [41] Lightweight Steganography Detection Method Based on Multiple Residual Structures and Transformer
    Li, Hao
    Zhang, Yi
    Wang, Jinwei
    Zhang, Weiming
    Luo, Xiangyang
    CHINESE JOURNAL OF ELECTRONICS, 2024, 33 (04) : 965 - 978
  • [42] Lightweight monocular depth estimation using a fusion-improved transformer
    Sui, Xin
    Gao, Song
    Xu, Aigong
    Zhang, Cong
    Wang, Changqiang
    Shi, Zhengxu
    SCIENTIFIC REPORTS, 2024, 14 (01):
  • [43] Kernelized Convolutional and Transformer Based Hierarchical Spatio-temporal Attention Network for Autonomous Vehicle Trajectory Prediction
    Sharma, Omveer
    Sahoo, N. C.
    Puhan, N. B.
    INTERNATIONAL JOURNAL OF INTELLIGENT TRANSPORTATION SYSTEMS RESEARCH, 2025,
  • [44] CFormerFaceNet: Efficient Lightweight Network Merging a CNN and Transformer for Face Recognition
    He, Lin
    He, Lile
    Peng, Lijun
    APPLIED SCIENCES-BASEL, 2023, 13 (11):
  • [45] DATransformer: Deep Attention Transformer
    Matsumoto, Riku
    Kimura, Masaomi
    PROCEEDINGS OF THE 2024 9TH INTERNATIONAL CONFERENCE ON INTELLIGENT INFORMATION TECHNOLOGY, ICIIT 2024, 2024, : 371 - 375
  • [46] Relaxed Attention for Transformer Models
    Lohrenz, Timo
    Moeller, Bjoern
    Li, Zhengyang
    Fingscheidt, Tim
    2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023,
  • [47] A multi-scale enhanced large-kernel attention transformer network for lightweight image super-resolution
    Chang, Kairong
    Jun, Sun
    Biao, Yang
    Hu, Mingzhi
    Yang, Junlong
    SIGNAL IMAGE AND VIDEO PROCESSING, 2025, 19 (03)
  • [48] SVDFormer: lightweight transformer based on singular value decomposition
    Ren, Xiangyu
    Chen, Jian
    Zhang, Fanlong
    THE JOURNAL OF SUPERCOMPUTING, 81 (8)
  • [49] Progressive convolutional transformer for image restoration
    Wan, Yecong
    Shao, Mingwen
    Cheng, Yuanshuo
    Meng, Deyu
    Zuo, Wangmeng
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 125
  • [50] A Lightweight Object Tracking Method Based on Transformer
    Sun, Ziwen
    Yang, Chuandong
    Ling, Chong
    PROCEEDINGS OF 2023 7TH INTERNATIONAL CONFERENCE ON ELECTRONIC INFORMATION TECHNOLOGY AND COMPUTER ENGINEERING, EITCE 2023, 2023, : 796 - 801