A Lightweight Transformer with Convolutional Attention

Cited by: 0
Authors
Zeng, Kungan [1 ]
Paik, Incheon [1 ]
Affiliations
[1] Univ Aizu, Sch Comp Sci & Engn, Fukushima, Japan
Source
2020 11TH INTERNATIONAL CONFERENCE ON AWARENESS SCIENCE AND TECHNOLOGY (ICAST) | 2020
Keywords
neural machine translation; Transformer; CNN; Multi-head attention;
DOI
10.1109/ICAST51195.2020.9319489
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline Codes
0808 ; 0809 ;
Abstract
Neural machine translation (NMT) has developed rapidly thanks to the application of various deep learning techniques. In particular, how to construct more effective NMT architectures has attracted increasing attention. The Transformer is a state-of-the-art NMT architecture: it relies entirely on a self-attention mechanism instead of recurrent neural networks (RNNs). Multi-head attention is the crucial component that implements self-attention, and it also dramatically affects the scale of the model. In this paper, we present a new multi-head attention that incorporates a convolution operation. Compared with the base Transformer, our approach effectively reduces the number of parameters. We also conduct a controlled experiment; the results show that the new model performs comparably to the base model.
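The abstract's parameter-reduction claim can be made concrete with a back-of-the-envelope count. The sketch below compares the weight count of standard multi-head attention (four dense d_model × d_model projections, biases omitted) against one illustrative convolutional variant in which the three dense Q/K/V projections are replaced by a single depthwise convolution over the sequence. Note this substitution is an assumption for illustration only; the paper's exact construction of its convolutional attention is not given in this record.

```python
def mha_params(d_model: int) -> int:
    # Standard multi-head attention: Q, K, V, and output projections,
    # each a d_model x d_model weight matrix (biases omitted).
    # The head count does not change the total, since the per-head
    # projections concatenate back to d_model.
    return 4 * d_model * d_model

def conv_mha_params(d_model: int, kernel_size: int = 3) -> int:
    # Hypothetical convolutional variant (an assumption, not the
    # paper's design): replace the three dense Q/K/V projections with
    # one depthwise convolution over the sequence dimension
    # (d_model * kernel_size weights), keeping the output projection
    # dense (d_model * d_model weights).
    return d_model * kernel_size + d_model * d_model

# With d_model = 512 (the base Transformer setting), the dense
# projections dominate, so any convolutional replacement of Q/K/V
# cuts the attention sublayer's parameters by roughly 3/4.
base = mha_params(512)          # 4 * 512 * 512 = 1,048,576
conv = conv_mha_params(512)     # 512 * 3 + 512 * 512 = 263,680
```

Counts like these only cover the attention sublayers; the feed-forward blocks and embeddings are unchanged, so the whole-model savings are smaller than the per-sublayer ratio.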
Pages: 6
Related Papers
50 results
  • [41] Lightweight Steganography Detection Method Based on Multiple Residual Structures and Transformer
    Li, Hao
    Zhang, Yi
    Wang, Jinwei
    Zhang, Weiming
    Luo, Xiangyang
    CHINESE JOURNAL OF ELECTRONICS, 2024, 33 (04) : 965 - 978
  • [42] Lightweight monocular depth estimation using a fusion-improved transformer
    Sui, Xin
    Gao, Song
    Xu, Aigong
    Zhang, Cong
    Wang, Changqiang
    Shi, Zhengxu
    SCIENTIFIC REPORTS, 2024, 14 (01):
  • [43] Kernelized Convolutional and Transformer Based Hierarchical Spatio-temporal Attention Network for Autonomous Vehicle Trajectory Prediction
    Sharma, Omveer
    Sahoo, N. C.
    Puhan, N. B.
    INTERNATIONAL JOURNAL OF INTELLIGENT TRANSPORTATION SYSTEMS RESEARCH, 2025,
  • [44] CFormerFaceNet: Efficient Lightweight Network Merging a CNN and Transformer for Face Recognition
    He, Lin
    He, Lile
    Peng, Lijun
    APPLIED SCIENCES-BASEL, 2023, 13 (11):
  • [45] DATransformer: Deep Attention Transformer
    Matsumoto, Riku
    Kimura, Masaomi
    PROCEEDINGS OF THE 2024 9TH INTERNATIONAL CONFERENCE ON INTELLIGENT INFORMATION TECHNOLOGY, ICIIT 2024, 2024, : 371 - 375
  • [46] Relaxed Attention for Transformer Models
    Lohrenz, Timo
    Moeller, Bjoern
    Li, Zhengyang
    Fingscheidt, Tim
    2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023,
  • [47] A multi-scale enhanced large-kernel attention transformer network for lightweight image super-resolution
    Chang, Kairong
    Jun, Sun
    Biao, Yang
    Hu, Mingzhi
    Yang, Junlong
    SIGNAL IMAGE AND VIDEO PROCESSING, 2025, 19 (03)
  • [48] SVDFormer: lightweight transformer based on singular value decomposition
    Ren, Xiangyu
    Chen, Jian
    Zhang, Fanlong
    THE JOURNAL OF SUPERCOMPUTING, 81 (8)
  • [49] Progressive convolutional transformer for image restoration
    Wan, Yecong
    Shao, Mingwen
    Cheng, Yuanshuo
    Meng, Deyu
    Zuo, Wangmeng
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 125
  • [50] A Lightweight Object Tracking Method Based on Transformer
    Sun, Ziwen
    Yang, Chuandong
    Ling, Chong
    PROCEEDINGS OF 2023 7TH INTERNATIONAL CONFERENCE ON ELECTRONIC INFORMATION TECHNOLOGY AND COMPUTER ENGINEERING, EITCE 2023, 2023, : 796 - 801