ExchNet: A Unified Hashing Network for Large-Scale Fine-Grained Image Retrieval

Cited by: 39

Authors:

Cui, Quan [1]

Jiang, Qing-Yuan [2]

Wei, Xiu-Shen [3]

Li, Wu-Jun [2]

Yoshie, Osamu [1]

Affiliations:

[1] Waseda Univ, Grad Sch IPS, Fukuoka, Japan

[2] Nanjing Univ, Dept Comp Sci & Technol, Natl Key Lab Novel Software Technol, Nanjing, Peoples R China

[3] Megvii Technol, Megvii Res Nanjing, Nanjing, Peoples R China

Source:

COMPUTER VISION - ECCV 2020, PT III | 2020 / Vol. 12348

Keywords:

Fine-grained image retrieval; Learning to hash; Feature alignment; Large-scale image search; Quantization

DOI:

10.1007/978-3-030-58580-8_12

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Retrieving content relevant images from a large-scale fine-grained dataset could suffer from intolerably slow query speed and highly redundant storage cost, due to high-dimensional real-valued embeddings which aim to distinguish subtle visual differences of fine-grained objects. In this paper, we study the novel fine-grained hashing topic to generate compact binary codes for fine-grained images, leveraging the search and storage efficiency of hash learning to alleviate the aforementioned problems. Specifically, we propose a unified end-to-end trainable network, termed as ExchNet. Based on attention mechanisms and proposed attention constraints, ExchNet can firstly obtain both local and global features to represent object parts and the whole fine-grained objects, respectively. Furthermore, to ensure the discriminative ability and semantic meaning's consistency of these part-level features across images, we design a local feature alignment approach by performing a feature exchanging operation. Later, an alternating learning algorithm is employed to optimize the whole ExchNet and then generate the final binary hash codes. Validated by extensive experiments, our ExchNet consistently outperforms state-of-the-art generic hashing methods on five fine-grained datasets. Moreover, compared with other approximate nearest neighbor methods, ExchNet achieves the best speed-up and storage reduction, revealing its efficiency and practicality.
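The storage and speed argument in the abstract rests on comparing compact binary codes by Hamming distance instead of comparing high-dimensional real-valued embeddings. The following is a minimal sketch of that general idea only; it is not the ExchNet method. The sign-thresholding in `binarize`, the helper names, and the toy vectors are all assumptions made for illustration, whereas the paper learns its codes end to end.

```python
from typing import List, Sequence


def binarize(embedding: Sequence[float]) -> int:
    """Pack the signs of a real-valued vector into one integer, 1 bit per dimension.

    Assumption: plain sign thresholding stands in for a learned hash function.
    """
    code = 0
    for value in embedding:
        code = (code << 1) | (1 if value >= 0 else 0)
    return code


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two codes differ."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query: int, database: List[int]) -> List[int]:
    """Return database indices sorted from nearest to farthest code."""
    return sorted(range(len(database)),
                  key=lambda i: hamming_distance(query, database[i]))


# Toy 4-dimensional embeddings (hypothetical data); real codes would be longer.
database = [
    binarize([0.9, -0.2, 0.4, -0.7]),
    binarize([-0.5, -0.1, 0.3, 0.8]),
    binarize([0.7, -0.3, -0.6, -0.9]),
]
query = binarize([0.8, -0.4, 0.5, -0.1])
ranking = rank_by_hamming(query, database)
print(ranking)
```

Each item is stored as a handful of bits rather than a vector of floats, and each comparison reduces to an XOR followed by a bit count, which is where the claimed storage reduction and speed-up for hashing-based retrieval come from.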

Pages: 189-205

Page count: 17
