Learning Multifunctional Binary Codes for Personalized Image Retrieval

Cited by: 3
Authors
Liu, Haomiao [1,2,3]
Wang, Ruiping [1,2]
Shan, Shiguang [1,2]
Chen, Xilin [1,2]
Affiliations
[1] Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[3] Huawei EI Innovat Lab, Beijing 100085, Peoples R China
Keywords
Image retrieval; Multi-task learning; Hashing; Representation; Network
DOI
10.1007/s11263-020-01315-0
CLC number
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Because the semantic content of an image is highly complex, even the same query image can call for very different, personalized content-based retrieval results in different scenarios. However, most existing hashing methods preserve only a single type of semantic similarity, making them incapable of handling such realistic retrieval tasks. To address this problem, we propose a unified hashing framework that encodes multiple types of information into the binary codes by exploiting convolutional neural networks (CNNs). Specifically, we assume that typical retrieval tasks are generally defined along two aspects, i.e., high-level semantics (e.g., object categories) and visual attributes (e.g., object shape and color). To this end, our Dual Purpose Hashing model is trained to jointly preserve two kinds of similarities characterizing these two aspects respectively. Moreover, since images carrying both category and attribute labels are scarce, the model is carefully designed to leverage abundant partially labelled data as training inputs, alleviating the risk of overfitting. Within this framework, the binary codes of newly arriving images can be readily obtained by quantizing the outputs of a specific CNN layer, and different retrieval tasks can be carried out by using the binary codes in different ways. Experiments on two large-scale datasets show that our method achieves performance comparable to or even better than that of state-of-the-art methods specifically designed for each individual retrieval task, while producing more compact codes than the compared methods.
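
The abstract describes the mechanism only at a high level: a shared CNN whose code-layer outputs are quantized into binary codes, with category and attribute supervision trained jointly, and images carrying only one kind of label still contributing to training. The sketch below illustrates that idea in PyTorch; it is not the authors' released implementation, and the backbone choice, layer sizes, class/attribute counts, and the masked loss are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models


class DualPurposeHashNet(nn.Module):
    """Shared CNN + code layer + two task heads (categories, attributes)."""

    def __init__(self, code_bits=48, num_categories=100, num_attributes=40):
        super().__init__()
        backbone = models.resnet18(weights=None)    # stand-in backbone, not the paper's
        backbone.fc = nn.Identity()                 # expose the 512-d pooled feature
        self.backbone = backbone
        self.code_layer = nn.Linear(512, code_bits)            # outputs to be quantized
        self.cat_head = nn.Linear(code_bits, num_categories)   # high-level semantics
        self.attr_head = nn.Linear(code_bits, num_attributes)  # visual attributes

    def forward(self, x):
        h = torch.sigmoid(self.code_layer(self.backbone(x)))   # relaxed codes in (0, 1)
        return h, self.cat_head(h), self.attr_head(h)

    @torch.no_grad()
    def binary_codes(self, x):
        h, _, _ = self.forward(x)
        return (h > 0.5).to(torch.uint8)            # quantize to {0, 1} bits


def partial_label_loss(cat_logits, attr_logits, cat_labels, attr_labels,
                       has_cat, has_attr):
    """Sum the two task losses, skipping whichever labels an image lacks."""
    loss = cat_logits.new_zeros(())
    if has_cat.any():
        loss = loss + F.cross_entropy(cat_logits[has_cat], cat_labels[has_cat])
    if has_attr.any():
        loss = loss + F.binary_cross_entropy_with_logits(
            attr_logits[has_attr], attr_labels[has_attr].float())
    return loss
```

At retrieval time, codes returned by binary_codes() would be compared by Hamming distance, and different tasks could select or weight bits differently, mirroring the abstract's point that a single set of codes serves multiple retrieval purposes.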
Pages: 2223-2242
Number of pages: 20