Transductive Zero-Shot Hashing for Multilabel Image Retrieval

Cited: 7
Authors
Zou, Qin [1 ]
Cao, Ling [1 ]
Zhang, Zheng [1 ]
Chen, Long [2 ]
Wang, Song [3 ,4 ]
Affiliations
[1] Wuhan Univ, Sch Comp Sci, Wuhan 430072, Peoples R China
[2] Sun Yat Sen Univ, Sch Data & Comp Sci, Guangzhou 518001, Peoples R China
[3] Univ South Carolina, Dept Comp Sci & Engn, Columbia, SC 29201 USA
[4] Tianjin Univ, Sch Comp Sci & Technol, Tianjin 300072, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Semantics; Image retrieval; Training; Hash functions; Data models; Visualization; Quantization (signal); Deep hashing; image retrieval; multilabel image; transductive learning; zero-shot learning; CODES;
DOI
10.1109/TNNLS.2020.3043298
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Hash coding has been widely used in approximate nearest neighbor search for large-scale image retrieval. Given semantic annotations such as class labels and pairwise similarities of the training data, hashing methods can learn and generate effective and compact binary codes. However, newly introduced images may carry semantic labels that are undefined in the training set; we call these unseen images, and zero-shot hashing (ZSH) techniques have been studied for their retrieval. Existing ZSH methods mainly focus on the retrieval of single-label images and cannot handle multilabel ones. In this article, for the first time, a novel transductive ZSH method is proposed for multilabel unseen image retrieval. To predict the labels of the unseen/target data, a visual-semantic bridge is built via instance-concept coherence ranking on the seen/source data. Then, a pairwise similarity loss and a focal quantization loss are constructed to train a hashing model on both the seen/source and unseen/target data. Extensive evaluations on three popular multilabel data sets demonstrate that the proposed method achieves significantly better results than the comparison methods.
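The two training objectives named in the abstract can be sketched in a hedged form. The formulations below are illustrative assumptions rather than the paper's exact definitions: the pairwise similarity term follows the common sigmoid-likelihood form used in deep hashing (e.g., HashNet-style), and the focal quantization term is one plausible reading of a "focal" weighting that down-weights bits already close to binary values; the names `pairwise_similarity_loss`, `focal_quantization_loss`, and the parameter `gamma` are placeholders.

```python
import numpy as np

def pairwise_similarity_loss(h, S):
    """Sigmoid-likelihood pairwise loss (assumed form, not the paper's exact one).

    h: (n, k) real-valued hash-layer outputs in [-1, 1].
    S: (n, n) binary similarity matrix (1 = similar pair, 0 = dissimilar).
    """
    inner = h @ h.T / 2.0  # scaled inner product as a similarity proxy
    # Negative log-likelihood: log(1 + e^x) - S*x, nonnegative for S in {0, 1}.
    return np.mean(np.log1p(np.exp(inner)) - S * inner)

def focal_quantization_loss(h, gamma=2.0):
    """Focal-style quantization penalty (assumed form).

    Pushes each output toward {-1, +1}; the (1 - p)^gamma factor shrinks
    the penalty on bits that are already nearly binary.
    """
    p = np.abs(h)  # closeness to +/-1, in [0, 1] for h in [-1, 1]
    return np.mean(-((1.0 - p) ** gamma) * np.log(p + 1e-8))
```

A total training objective under these assumptions would weight the two terms, e.g. `pairwise_similarity_loss(h, S) + lam * focal_quantization_loss(h)`, with `lam` a tunable trade-off hyperparameter.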
Pages: 1673-1687
Page count: 15