Unifying Deep Local and Global Features for Image Search

Cited by: 260
Authors
Cao, Bingyi [1]
Araujo, Andre [1]
Sim, Jack [1]
Affiliations
[1] Google Res, Mountain View, CA 94039 USA
Source
COMPUTER VISION - ECCV 2020, PT XX | 2020 / Vol. 12365
Keywords
Deep features; Image retrieval; Unified model; DESCRIPTORS; MODEL;
DOI
10.1007/978-3-030-58565-5_43
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Image retrieval is the problem of searching an image database for items that are similar to a query image. To address this task, two main types of image representations have been studied: global and local image features. In this work, our key contribution is to unify global and local features into a single deep model, enabling accurate retrieval with efficient feature extraction. We refer to the new model as DELG, standing for DEep Local and Global features. We leverage lessons from recent feature learning work and propose a model that combines generalized mean pooling for global features and attentive selection for local features. The entire network can be learned end-to-end by carefully balancing the gradient flow between two heads - requiring only image-level labels. We also introduce an autoencoder-based dimensionality reduction technique for local features, which is integrated into the model, improving training efficiency and matching performance. Comprehensive experiments show that our model achieves state-of-the-art image retrieval on the Revisited Oxford and Paris datasets, and state-of-the-art single-model instance-level recognition on the Google Landmarks dataset v2. Code and models are available at https://github.com/tensorflow/models/tree/master/research/delf.
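The abstract names generalized mean (GeM) pooling as the global-feature head. As a rough illustration only, and not the authors' implementation, the sketch below applies the standard GeM formula per channel: raise each activation to a power p, average over spatial positions, then take the p-th root. The function name `gem_pool`, the default `p=3.0`, the `eps` clamp, and the toy values are all assumptions made for this example.

```python
def gem_pool(feature_map, p=3.0, eps=1e-6):
    """Generalized mean pooling over spatial positions, per channel.

    feature_map: list of channels, each a flat list of activations
    (hypothetical layout chosen for this sketch).
    """
    pooled = []
    for channel in feature_map:
        # Clamp to eps so fractional powers stay well defined.
        powered = [max(v, eps) ** p for v in channel]
        pooled.append((sum(powered) / len(powered)) ** (1.0 / p))
    return pooled


# Toy 2-channel map with 4 spatial positions each (made-up values).
toy = [[1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 2.0]]
print(gem_pool(toy, p=1.0))  # p = 1 reduces to average pooling
print(gem_pool(toy, p=3.0))  # larger p weights strong activations more
```

With p = 1 the result is plain average pooling, and as p grows the result approaches max pooling, which is why GeM is often described as interpolating between the two.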
Pages: 726-743
Number of pages: 18