Large-Scale Image Retrieval with Deep Attentive Global Features

Cited by: 9

Authors:

Zhu, Yingying [1]

Wang, Yinghao [1]

Chen, Haonan [1]

Guo, Zemian [1]

Huang, Qiang [1]

Affiliations:

[1] Shenzhen Univ, Coll Comp Sci & Software Engn, Nanhai Ave 3688, Shenzhen 518060, Guangdong, Peoples R China

Source:

INTERNATIONAL JOURNAL OF NEURAL SYSTEMS | 2023 / Vol. 33 / No. 03

Funding:

National Natural Science Foundation of China;

Keywords:

Image retrieval; attention mechanism; convolutional neural network;

DOI:

10.1142/S0129065723500132

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

How to obtain discriminative features has proved to be a core problem for image retrieval. Many recent works use convolutional neural networks to extract features. However, clutter and occlusion will interfere with the distinguishability of features when using convolutional neural network (CNN) for feature extraction. To address this problem, we intend to obtain high-response activations in the feature map based on the attention mechanism. We propose two attention modules, a spatial attention module and a channel attention module. For the spatial attention module, we first capture the global information and model the relation between channels as a region evaluator, which evaluates and assigns new weights to local features. For the channel attention module, we use a vector with trainable parameters to weight the importance of each feature map. The two attention modules are cascaded to adjust the weight distribution for the feature map, which makes the extracted features more discriminative. Furthermore, we present a scale and mask scheme to scale the major components and filter out the meaningless local features. This scheme can reduce the disadvantages of the various scales of the major components in images by applying multiple scale filters, and filter out the redundant features with the MAX-Mask. Exhaustive experiments demonstrate that the two attention modules are complementary to improve performance, and our network with the three modules outperforms the state-of-the-art methods on four well-known image retrieval datasets.
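
The cascade of spatial and channel attention described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the layer details, so the sketch assumes that the global information is the average of all local features, that the region evaluator is a softmax over the similarity between each location and that average, that spatial attention is applied before channel attention, and that the final descriptor is sum-pooled and L2-normalized. The function names `spatial_attention`, `channel_attention`, and `global_feature` are hypothetical, and the scale and MAX-Mask scheme is left out because the abstract does not specify it.

```python
import numpy as np


def spatial_attention(fmap):
    """Reweight each spatial location of a (C, H, W) feature map.

    Assumption: the global descriptor is the mean local feature, and a
    location's weight is a softmax over its similarity to that descriptor.
    """
    c, h, w = fmap.shape
    local = fmap.reshape(c, h * w)       # one C-dim local feature per location
    global_desc = local.mean(axis=1)     # assumed form of the global information
    scores = global_desc @ local         # (H*W,) similarity per location
    scores = scores - scores.max()       # shift for a numerically stable softmax
    weights = np.exp(scores) / np.exp(scores).sum()
    return (local * weights).reshape(c, h, w)


def channel_attention(fmap, channel_weights):
    """Scale each channel by one entry of a (C,) vector.

    In a trained network this vector would be learned; here it is an input.
    """
    return fmap * channel_weights[:, None, None]


def global_feature(fmap, channel_weights):
    """Cascade both modules, then sum-pool and L2-normalize (assumed order)."""
    attended = channel_attention(spatial_attention(fmap), channel_weights)
    desc = attended.sum(axis=(1, 2))
    norm = np.linalg.norm(desc)
    return desc / norm if norm > 0 else desc
```

Calling `global_feature(fmap, np.ones(C))` on a `(C, H, W)` array returns a unit-length vector of size `C`, which could then be compared across images with a dot product. Setting an entry of the weight vector to zero removes that channel from the descriptor, which is the sense in which the vector weights the importance of each feature map.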

Pages: 18