Fine-Grained Visual Classification Based on Sparse Bilinear Convolutional Neural Network

Cited by: 0

Authors:

Ma L. [1]

Wang Y. [1,2]

Affiliations:

[1] School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai

[2] Shanghai Engineering Research Center of Assistive Devices, Shanghai

Source:

Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence | 2019 / Vol. 32 / No. 04

Funding:

National Natural Science Foundation of China

Keywords:

Bilinear Convolutional Neural Network (B-CNN); Fine-Grained Visual Recognition; Network Pruning; Network Sparsity; Overfitting

DOI:

10.16451/j.cnki.issn1003-6059.201904006

Abstract:

The overfitting problem of the bilinear convolutional neural network (B-CNN) in fine-grained visual recognition is caused by its large number of parameters and its complex structure. In this paper, a sparse B-CNN is proposed to address the problem. First, a scaling factor is introduced for each feature channel of the B-CNN, and sparsity regularization is applied to the scaling factors during training. Then, the feature channels that contribute little to the final classification are identified by their small scaling factors. Finally, these channels are pruned in a certain proportion to prevent overfitting and to increase the significance of key features. The learning of the sparse B-CNN is weakly supervised and end-to-end. Verification experiments on the FGVC-Aircraft, Stanford Dogs and Stanford Cars fine-grained image datasets show that the accuracy of the sparse B-CNN is higher than that of the original B-CNN. Moreover, compared with other advanced algorithms for fine-grained visual recognition, the sparse B-CNN performs comparably or even better. 2019, Science Press. All rights reserved.
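The two mechanical steps named in the abstract, a sparsity penalty on per-channel scaling factors and pruning of the channels with the smallest factors, can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not give the loss, the layers involved, or the pruning proportion. The L1 form of the penalty and the names `l1_penalty`, `channels_to_keep`, `lam` and `prune_ratio` are assumptions made for illustration.

```python
def l1_penalty(scales, lam):
    """Sparsity term added to the training loss: lam * sum(|gamma_c|).

    Assumption: an L1 penalty is used; the record only says
    "regularization of sparsity".
    """
    return lam * sum(abs(g) for g in scales)


def channels_to_keep(scales, prune_ratio):
    """Return the sorted indices of channels that survive pruning.

    Channels with the smallest |gamma| are removed. prune_ratio is the
    fraction of channels to drop (a hypothetical parameter name).
    """
    if not 0.0 <= prune_ratio < 1.0:
        raise ValueError("prune_ratio must be in [0, 1)")
    n_prune = int(len(scales) * prune_ratio)
    # Rank channel indices from smallest to largest magnitude.
    ranked = sorted(range(len(scales)), key=lambda c: abs(scales[c]))
    return sorted(ranked[n_prune:])


# Toy scaling factors for six feature channels (made-up values).
gammas = [0.9, 0.01, -0.5, 0.002, 0.3, 0.05]
kept = channels_to_keep(gammas, prune_ratio=0.5)
print(kept)  # indices of the three channels with the largest |gamma|
```

In a bilinear model the pooled descriptor grows with the product of the two streams' channel counts, so dropping channels in this way also shrinks the classifier input. The sketch only selects indices and leaves the actual network surgery and fine-tuning aside.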

Pages: 336-344

Page count: 8
