Improved ResNet Image Classification Model Based on Tensor Synthesis Attention

Cited by: 1
Authors
Qiu Yunfei [1]
Zhang Jiaxin [1]
Lan Hai [2]
Zong Jiaxu [3]
Affiliations
[1] Liaoning Tech Univ, Coll Software, Huludao 125105, Liaoning, Peoples R China
[2] Chinese Acad Sci, Quanzhou Inst Equipment Mfg Haixi Inst, Quanzhou 362216, Fujian, Peoples R China
[3] Yuanqi Ind Technol Co, Qingdao 266000, Shandong, Peoples R China
Keywords
tensor synthesis attention; residual network; self-attention; feature extraction; image classification;
DOI
10.3788/LOP212836
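Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline classification code
0808; 0809;
Abstract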
An improved ResNet-101 network model that fuses tensor synthesis attention (RTSA Net-101) is proposed to address two problems that arise when convolutional neural networks are used for image classification: insufficient feature extraction and the undifferentiated contribution of the extracted features. First, image features are extracted using a ResNet-101 backbone network, and the tensor synthesis attention module is embedded after the convolution structure of the residual network. The features are combined using a three-tensor product to obtain the attention feature matrix. Next, the Softmax function is used to normalize the attention feature matrix, which assigns weights to the features and distinguishes their contributions. Finally, the weighted sum of the weights and the critical values is computed as the final features, which improves image classification performance. Comparative experiments are conducted on the natural image datasets CIFAR-10 and CIFAR-100 and on the street view house number dataset SVHN. The classification accuracies of the model are 96.12%, 81.60%, and 96.67%, respectively, and the average test running times per image are 0.0258 s, 0.0260 s, and 0.0262 s, respectively. The experimental results show that, compared with seven other advanced image classification models, the RTSA Net-101 model achieves higher classification accuracy and shorter test running time and effectively enhances the feature learning ability of the network, which indicates that the proposed model is innovative and efficient.
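The attention steps described in the abstract (a three-way product producing an attention matrix, Softmax normalization, then a weighted sum) can be illustrated with a minimal pure-Python sketch. The abstract does not give the exact form of the three-tensor product, so the trilinear score below, the function names `trilinear_scores` and `attend`, and the toy inputs are all assumptions made for illustration; this is not the authors' implementation.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def trilinear_scores(a, b, c):
    """Assumed three-way product: s[i][j] = sum_d a[i][d] * b[j][d] * c[d].

    The paper's actual tensor product may differ; this form is hypothetical.
    """
    dim = len(c)
    return [[sum(ai[d] * bj[d] * c[d] for d in range(dim)) for bj in b]
            for ai in a]


def attend(a, b, c, values):
    """Normalize the scores row by row, then take a weighted sum of `values`."""
    scores = trilinear_scores(a, b, c)
    weights = [softmax(row) for row in scores]
    width = len(values[0])
    out = [[sum(w[j] * values[j][k] for j in range(len(values)))
            for k in range(width)]
           for w in weights]
    return weights, out


# Toy input: three feature vectors of dimension two (illustrative only).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scale = [1.0, 1.0]
weights, out = attend(features, features, scale, features)
```

Each row of `weights` sums to 1, so every output vector is a convex combination of the value vectors; features with larger scores contribute more, which is the sense in which Softmax "distinguishes the contribution of features".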
Pages: 10
Related papers
30 in total
  • [1] [常东良 Chang Dongliang], 2021, [图学学报, Journal of Graphics], V42, P32
  • [2] [陈琳琳 Chen Linlin], 2020, [南京理工大学学报. 自然科学版, Journal of Nanjing University of Science and Technology], V44, P669
  • [3] Cordonnier JB, 2020, Arxiv, DOI arXiv:1911.03584
  • [4] A multilinear singular value decomposition
    De Lathauwer, L
    De Moor, B
    Vandewalle, J
    [J]. SIAM JOURNAL ON MATRIX ANALYSIS AND APPLICATIONS, 2000, 21 (04) : 1253 - 1278
  • [5] Dupont E., 2019, arXiv
  • [6] With a Little Help from My Friends: Nearest-Neighbor Contrastive Learning of Visual Representations
    Dwibedi, Debidatta
    Aytar, Yusuf
    Tompson, Jonathan
    Sermanet, Pierre
    Zisserman, Andrew
    [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 9568 - 9577
  • [7] [付晓 Fu Xiao], 2020, [自动化学报, Acta Automatica Sinica], V46, P531
  • [8] Effects of porosity and area density on upward flame spread characteristics over thin flax fabric
    Gao, Yunji
    Zhu, Hui
    Zhang, Yuchun
    Zhu, Guoqing
    Chai, Guoqiang
    [J]. TEXTILE RESEARCH JOURNAL, 2021, 91 (5-6) : 681 - 690
  • [9] [郭玉荣 Guo Yurong], 2020, [中国图象图形学报, Journal of Image and Graphics], V25, P486
  • [10] Hassani A, 2022, Arxiv, DOI [arXiv:2104.05704, DOI 10.48550/ARXIV.2104.05704]