Selective Sparse Sampling for Fine-grained Image Recognition

Cited by: 201
Authors
Ding, Yao [1 ,2 ]
Zhou, Yanzhao [1 ]
Zhu, Yi [1 ]
Ye, Qixiang [1 ,2 ]
Jiao, Jianbin [1 ]
Affiliations
[1] Univ Chinese Acad Sci, Beijing, Peoples R China
[2] Peng Cheng Lab, Shenzhen, Peoples R China
Source
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019) | 2019
DOI
10.1109/ICCV.2019.00670
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Fine-grained recognition poses the unique challenge of capturing subtle inter-class differences under considerable intra-class variances (e.g., beaks for bird species). Conventional approaches crop local regions and learn detailed representation from those regions, but suffer from the fixed number of parts and missing of surrounding context. In this paper, we propose a simple yet effective framework, called Selective Sparse Sampling, to capture diverse and fine-grained details. The framework is implemented using Convolutional Neural Networks, referred to as Selective Sparse Sampling Networks (S3Ns). With image-level supervision, S3Ns collect peaks, i.e., local maximums, from class response maps to estimate informative receptive fields and learn a set of sparse attention for capturing fine-detailed visual evidence as well as preserving context. The evidence is selectively sampled to extract discriminative and complementary features, which significantly enrich the learned representation and guide the network to discover more subtle cues. Extensive experiments and ablation studies show that the proposed method consistently outperforms the state-of-the-art methods on challenging benchmarks including CUB-200-2011, FGVC-Aircraft, and Stanford Cars.
Pages: 6598-6607
Number of pages: 10