Differentiable Patch Selection for Image Recognition

被引：54

作者：

Cordonnier, Jean-Baptiste ^{[1
]}

Mahendran, Aravindh ^{[2
]}

Dosovitskiy, Alexey ^{[2
]}

Weissenborn, Dirk ^{[2
]}

Uszkoreit, Jakob ^{[2
]}

Unterthiner, Thomas ^{[2
]}

机构：

[1] Ecole Polytech Fed Lausanne, Lausanne, Switzerland

[2] Google Res, Brain Team, Mountain View, CA USA

来源：

2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021 | 2021年

关键词：

D O I：

10.1109/CVPR46437.2021.00238

中图分类号：

TP18 [人工智能理论];

学科分类号：

081104 ; 0812 ; 0835 ; 1405 ;

摘要：

Neural Networks require large amounts of memory and compute to process high resolution images, even when only a small part of the image is actually informative for the task at hand. We propose a method based on a differentiable Top-K operator to select the most relevant parts of the input to efficiently process high resolution images. Our method may be interfaced with any downstream neural network, is able to aggregate information from different patches in a flexible way, and allows the whole model to be trained end-to-end using backpropagation. We show results for traffic sign recognition, inter-patch relationship reasoning, and fine-grained recognition without using object/part bounding box annotations during training.

引用

页码：2351 / 2360

页数：10

共 23 条

[1]

Abernethy J, 2016, NEURAL INF PROCESS S, P233

[2] Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering [J].

Anderson, Peter ;

He, Xiaodong ;

Buehler, Chris ;

Teney, Damien ;

Johnson, Mark ;

Gould, Stephen ;

Zhang, Lei .

2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, :6077-6086

[3] Higher-order Integration of Hierarchical Convolutional Activations for Fine-grained Visual Categorization [J].

Cai, Sijia ;

Zuo, Wangmeng ;

Zhang, Lei .

2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, :511-520

[4] Attention to Scale: Scale-aware Semantic Image Segmentation [J].

Chen, Liang-Chieh ;

Yang, Yi ;

Wang, Jiang ;

Xu, Wei ;

Yuille, Alan L. .

2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, :3640-3649

[5]

Cordonnier J., 2020, IBER CONF INF SYST, DOI DOI 10.23919/cisti49556.2020.9141108

[6]

Elsayed G. F., 2019, NeurIPS

[7] Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-grained Image Recognition [J].

Fu, Jianlong ;

Zheng, Heliang ;

Mei, Tao .

30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, :4476-4484

[8]

He KM, 2020, IEEE T PATTERN ANAL, V42, P386, DOI [10.1109/TPAMI.2018.2844175, 10.1109/ICCV.2017.322]

[9]

Hu J, 2018, PROC CVPR IEEE, P7132, DOI [10.1109/CVPR.2018.00745, 10.1109/TPAMI.2019.2913372]

[10]

Larsson F, 2011, LECT NOTES COMPUT SC, V6688, P238, DOI 10.1007/978-3-642-21227-7_23

← 1 2 3 →