Learning a Discriminative Filter Bank within a CNN for Fine-grained Recognition

Cited by: 320

Authors:

Wang, Yaming ^{[1]}

Morariu, Vlad I. ^{[1,2]}

Davis, Larry S. ^{[1]}

Affiliations:

[1] Univ Maryland, College Pk, MD 20742 USA

[2] Adobe Res, San Jose, CA USA

Source:

2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2018

Keywords:

DOI:

10.1109/CVPR.2018.00436

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Compared to earlier multistage frameworks using CNN features, recent end-to-end deep approaches for fine-grained recognition essentially enhance the mid-level learning capability of CNNs. Previous approaches achieve this by introducing an auxiliary network to infuse localization information into the main classification network, or a sophisticated feature encoding method to capture higher order feature statistics. We show that mid-level representation learning can be enhanced within the CNN framework, by learning a bank of convolutional filters that capture class-specific discriminative patches without extra part or bounding box annotations. Such a filter bank is well structured, properly initialized and discriminatively learned through a novel asymmetric multi-stream architecture with convolutional filter supervision and a non-random layer initialization. Experimental results show that our approach achieves state-of-the-art on three publicly available fine-grained recognition datasets (CUB-200-2011, Stanford Cars and FGVC-Aircraft). Ablation studies and visualizations are provided to understand our approach.
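
To make the idea of a bank of class-specific patch-detecting filters more concrete, the following is a minimal, illustrative sketch and not the authors' implementation. The abstract does not give these details, so everything here is an assumption: each filter is treated as a 1x1 filter applied at every spatial location of a feature map, its strongest response is kept by global max pooling, and a fixed number of filters per class is averaged into one class score. The function names, the toy feature map and the filter weights are all hypothetical.

```python
# Illustrative sketch only. Assumptions not stated in the abstract:
# 1x1 filters, global max pooling, and k filters per class averaged
# into a single class score. All names and values are hypothetical.

def filter_bank_responses(feature_map, filters):
    """Return the strongest response of each 1x1 filter over all locations.

    feature_map: list of H rows; each row is a list of W locations;
                 each location is a list of C channel values.
    filters:     list of filters; each filter is a list of C weights.
    """
    responses = []
    for weights in filters:
        # Dot product of the filter with every location, then global max.
        best = max(
            sum(w * x for w, x in zip(weights, location))
            for row in feature_map
            for location in row
        )
        responses.append(best)
    return responses


def class_scores(responses, filters_per_class):
    """Average the pooled responses of each class's group of filters."""
    k = filters_per_class
    return [sum(responses[i:i + k]) / k for i in range(0, len(responses), k)]


# Toy 2x2 feature map with 2 channels per location.
feature_map = [
    [[1.0, 0.0], [0.0, 2.0]],
    [[0.5, 0.5], [3.0, 0.0]],
]

# 2 classes, 2 filters per class (first two belong to class 0).
filters = [
    [1.0, 0.0],
    [0.5, 0.5],
    [0.0, 1.0],
    [-1.0, 1.0],
]

responses = filter_bank_responses(feature_map, filters)
scores = class_scores(responses, filters_per_class=2)
print(responses)  # [3.0, 1.5, 2.0, 2.0]
print(scores)     # [2.25, 2.0]
```

In this toy setup, a loss applied directly to `scores` would push each group of filters to respond strongly only on patches from its own class, which is one plausible reading of "convolutional filter supervision"; the paper itself should be consulted for the actual architecture and initialization.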

Citation

Pages: 4148-4157

Page count: 10