A survey of fine-grained visual categorization based on deep learning

Authors
Xie Yuxiang [1]
Gong Quanzhi [1]
Luan Xidao [2]
Yan Jie [1]
Zhang Jiahui [1]
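Affiliations
[1] National University of Defense Technology, College of Systems Engineering, Changsha 410000, People's Republic of China
[2] Changsha University, College of Computer Engineering and Applied Mathematics, Changsha 410003, People's Republic of China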
Funding
National Natural Science Foundation of China
Keywords
deep learning; fine-grained visual categorization; convolutional neural network (CNN); visual attention; attention; network
DOI
10.23919/JSEE.2022.000155
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Deep learning has achieved excellent results in various computer vision tasks, especially in fine-grained visual categorization, which aims to distinguish the subordinate categories within label-level categories. Because of high intra-class variance and high inter-class similarity, fine-grained visual categorization is extremely challenging. This paper first briefly introduces and analyzes the related public datasets, then reviews some of the latest methods. Based on the feature types, the feature processing methods, and the overall structure used in the model, we divide them into three types: methods based on a general convolutional neural network (CNN) and strong supervision of parts, methods based on single feature processing, and methods based on multiple feature processing. Most methods of the first type have a relatively simple structure, reflecting their origin in early research. The methods of the other two types include models with special structures and training processes that help to obtain discriminative features. We analyze in detail several methods that achieve high accuracy on public datasets. In addition, we argue that future research should focus on reducing the dependence of existing methods on large amounts of data and computing power. In terms of technology, extracting subtle feature information with the burgeoning vision transformer (ViT) network is also an important research direction.
Pages: 20