Bilinear Residual Attention Networks for Fine-Grained Image Classification

Cited by: 6
Authors
Wang Yang [1]
Liu Libo [1]
Affiliations
[1] Ningxia Univ, Sch Informat Engn, Yinchuan 750021, Ningxia, Peoples R China
Keywords
image processing; fine-grained image classification; attention mechanism; residual network; channel attention; spatial attention
DOI
10.3788/LOP57.121011
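Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification codes
0808; 0809
Abstract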
Fine-grained images are highly similar in appearance, and the differences between categories are often confined to local regions, so extracting discriminative local features plays a key role in fine-grained classification. Attention mechanisms are a common strategy for this problem. We therefore propose an improved bilinear residual attention network based on the bilinear convolutional neural network model. The feature function of the original model is replaced by a deep residual network with stronger feature extraction capability, and a channel attention module and a spatial attention module are added between the residual units to obtain richer attention features across different dimensions. Ablation and comparison experiments were performed on three fine-grained image datasets, CUB-200-2011, Stanford Dogs, and Stanford Cars, on which the improved model reached classification accuracies of 87.2%, 89.2%, and 92.5%, respectively. The experimental results show that our method achieves better classification results than the original model and other mainstream fine-grained classification algorithms.
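The abstract does not spell out how the two feature streams are combined. As a rough illustration only, the sketch below shows the bilinear pooling step commonly used in bilinear CNN models: the outer product of the two streams' features at each location, summed over locations, followed by a signed square root and L2 normalisation. The function name `bilinear_pool` and the toy inputs are hypothetical, and the residual backbone and attention modules described above are not modelled here.

```python
import math


def bilinear_pool(feats_a, feats_b):
    """Sum-pool the outer product of two feature streams over all locations.

    feats_a, feats_b: lists of per-location feature vectors (one vector per
    spatial location). Returns a flattened descriptor of length C_a * C_b
    after a signed square root and L2 normalisation.
    """
    if len(feats_a) != len(feats_b):
        raise ValueError("both streams must cover the same locations")
    c_a, c_b = len(feats_a[0]), len(feats_b[0])

    # Accumulate the outer product of the two feature vectors at each location.
    pooled = [0.0] * (c_a * c_b)
    for vec_a, vec_b in zip(feats_a, feats_b):
        for i, x in enumerate(vec_a):
            for j, y in enumerate(vec_b):
                pooled[i * c_b + j] += x * y

    # Signed square root, then L2 normalisation of the flattened descriptor.
    pooled = [math.copysign(math.sqrt(abs(v)), v) for v in pooled]
    norm = math.sqrt(sum(v * v for v in pooled))
    return [v / norm for v in pooled] if norm > 0 else pooled


# Toy input (assumed values): 2 spatial locations, 2 channels per stream.
stream_a = [[1.0, 0.0], [0.0, 2.0]]
stream_b = [[3.0, 1.0], [1.0, 0.0]]
descriptor = bilinear_pool(stream_a, stream_b)
```

In a real model the two streams would be convolutional feature maps with hundreds of channels, so the descriptor grows quadratically with channel count; the nested loops here are for readability, not speed.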
Pages: 10