Fine-Grained Visual Classification Based on Sparse Bilinear Convolutional Neural Network

Cited by: 0
Authors
Ma L. [1]
Wang Y. [1, 2]
Affiliations
[1] School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai
[2] Shanghai Engineering Research Center of Assistive Devices, Shanghai
Source
Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence | 2019, Vol. 32, No. 04
Funding
National Natural Science Foundation of China
Keywords
Bilinear Convolutional Neural Network (B-CNN); Fine-Grained Visual Recognition; Network Pruning; Network Sparsity; Overfitting
DOI
10.16451/j.cnki.issn1003-6059.201904006
Abstract
Bilinear convolutional neural networks (B-CNNs) for fine-grained visual recognition are prone to overfitting because of their large number of parameters and complex structure. This paper proposes a sparse B-CNN to address the problem. First, a scaling factor is introduced for each feature channel of the B-CNN, and sparsity regularization is applied to the scaling factors during training. Then, feature channels that contribute little to the final classification are identified by their small scaling factors. Finally, a certain proportion of these channels is pruned to prevent overfitting and to increase the significance of key features. The sparse B-CNN is trained end-to-end with weak supervision. Verification experiments on the FGVC-Aircraft, Stanford Dogs, and Stanford Cars fine-grained image datasets show that the sparse B-CNN achieves higher accuracy than the original B-CNN. Moreover, compared with other advanced algorithms for fine-grained visual recognition, the sparse B-CNN performs equally well or better. © 2019, Science Press. All rights reserved.
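The three steps in the abstract (per-channel scaling factors, sparsity regularization, proportional pruning) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the choice of an L1 penalty as the sparsity regularizer, the weight `lam`, and the averaged bilinear pooling with signed square root and L2 normalization (borrowed from the standard B-CNN formulation) are all assumptions made for illustration.

```python
import numpy as np

def l1_sparsity_penalty(gammas, lam=1e-4):
    """Assumed regularizer: L1 norm of the channel scaling factors,
    weighted by a hypothetical coefficient lam and added to the loss."""
    return lam * np.sum(np.abs(gammas))

def select_channels(gammas, prune_ratio):
    """Return sorted indices of the channels kept after pruning the
    prune_ratio fraction with the smallest |scaling factor|."""
    n_prune = int(len(gammas) * prune_ratio)
    order = np.argsort(np.abs(gammas), kind="stable")  # ascending magnitude
    return np.sort(order[n_prune:])

def bilinear_pool(feat_a, feat_b):
    """Bilinear pooling of two feature maps flattened to (channels, H*W).
    Averages the outer products over spatial locations, then applies the
    signed square root and L2 normalization used in standard B-CNNs."""
    pooled = feat_a @ feat_b.T / feat_a.shape[1]
    vec = pooled.ravel()
    vec = np.sign(vec) * np.sqrt(np.abs(vec))
    return vec / (np.linalg.norm(vec) + 1e-12)

# Toy example: four channels, prune half of them.
gammas = np.array([0.9, 0.01, 0.5, 0.02])      # hypothetical learned factors
keep = select_channels(gammas, prune_ratio=0.5)  # keeps channels 0 and 2
features = np.random.default_rng(0).standard_normal((4, 9))  # 4 channels, 3x3 map
descriptor = bilinear_pool(features[keep], features[keep])
print(keep, descriptor.shape)
```

Pruning two of four channels shrinks the bilinear descriptor from 16 to 4 entries, which shows why channel pruning reduces the parameter count of the classifier that follows the pooling layer.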
Pages: 336-344
Page count: 8
Related papers
29 in total
  • [1] Farrell R., Oza O., Zhang N., et al., Birdlets: Subordinate Categorization Using Volumetric Primitives and Pose-Normalized Appearance, Proc of the International Conference on Computer Vision, pp. 161-168, (2011)
  • [2] Zhang N., Farrell R., Darrell T., Pose Pooling Kernels for Sub-category Recognition, Proc of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3665-3672, (2012)
  • [3] Luo J.H., Wu J.X., A Survey on Fine-Grained Image Categorization Using Deep Convolutional Features, Acta Automatica Sinica, 43, 8, pp. 1306-1318, (2017)
  • [4] Cui Y., Song Y., Sun C., et al., Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning, Proc of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4109-4118, (2018)
  • [5] Wu L., Wang Y., Li X., et al., Deep Attention-Based Spatially Recursive Networks for Fine-Grained Visual Recognition, IEEE Transactions on Cybernetics, 49, 5, pp. 1791-1802, (2019)
  • [6] Lin T.Y., Roychowdhury A., Maji S., Bilinear CNN Models for Fine-Grained Visual Recognition, Proc of the IEEE International Conference on Computer Vision, pp. 1449-1457, (2015)
  • [7] Simonyan K., Zisserman A., Very Deep Convolutional Networks for Large-Scale Image Recognition
  • [8] LeCun Y., Denker J.S., Solla S.A., Optimal Brain Damage, Advances in Neural Information Processing Systems 2, pp. 598-605, (1990)
  • [9] Hinton G.E., Srivastava N., Krizhevsky A., et al., Improving Neural Networks by Preventing Co-adaptation of Feature Detectors
  • [10] Quinlan J.R., Bagging, Boosting, and C4.5, Proc of the 13th National Conference on Artificial Intelligence, pp. 725-730, (1996)