Deep code operation network for multi-label image retrieval

Cited by: 6
Authors
Song, Ge [1, 2, 3]
Tan, Xiaoyang [1, 2, 3]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci & Technol, Nanjing 211106, Peoples R China
[2] MIIT Key Lab Pattern Anal & Machine Intelligence, Nanjing, Jiangsu, Peoples R China
[3] Collaborat Innovat Ctr Novel Software Technol & I, Nanjing 211106, Peoples R China
Funding
US National Science Foundation;
Keywords
Multi-label image retrieval; Hashing; Deep learning;
DOI
10.1016/j.cviu.2020.102916
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Deep hashing methods have been extensively studied for large-scale image search and have achieved promising results in recent years. However, previous deep hashing methods for multi-label image retrieval have two major limitations: the first concerns how flexibly users can express their query intention (the so-called intention gap), and the second concerns the exploitation of the rich similarity structures of the semantic space (the so-called semantic gap). To address these issues, we propose a novel Deep Code Operation Network (CoNet), in which a user may present multiple images simultaneously, rather than a single one, as a query; the system then triggers a series of code operators to extract the hidden relations among them. In this way, a set of new queries is automatically constructed to cover the user's real, complex query intention, without the need to state them explicitly. CoNet is trained with a newly proposed margin-adaptive triplet loss function, which effectively encourages the system to incorporate the hierarchical similarity structures of the semantic space into the learning procedure of the code operations. The whole system has an end-to-end differentiable architecture and is equipped with an adversarial mechanism to further improve the quality of the final intention representation. Experimental results on four multi-label image datasets demonstrate that our method significantly improves on the state of the art in complex multi-label retrieval tasks with multiple query images.
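As a rough illustration of what a margin-adaptive triplet loss over hash codes can look like, the sketch below scales the margin by how much more label overlap the anchor has with the positive than with the negative. The abstract does not give the loss formula, so everything here is an assumption made for illustration: the Jaccard-based margin rule, the `base_margin` parameter, the function names, and the use of Hamming distance on already-binarized codes are not taken from the paper.

```python
def hamming(a, b):
    """Number of positions at which two equal-length binary codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def jaccard(labels_a, labels_b):
    """Label-set overlap in [0, 1]; defined as 1.0 when both sets are empty."""
    union = labels_a | labels_b
    if not union:
        return 1.0
    return len(labels_a & labels_b) / len(union)


def adaptive_triplet_loss(anchor, positive, negative, base_margin=4.0):
    """Hinge loss whose margin grows with the label-similarity gap.

    Each argument is a (code, labels) pair. The margin rule is an
    illustrative assumption, not the formula used by CoNet.
    """
    (code_a, lab_a), (code_p, lab_p), (code_n, lab_n) = anchor, positive, negative
    # Larger gap in label overlap -> larger required separation in code space.
    gap = jaccard(lab_a, lab_p) - jaccard(lab_a, lab_n)
    margin = base_margin * max(gap, 0.0)
    return max(0.0, hamming(code_a, code_p) - hamming(code_a, code_n) + margin)


# Hypothetical 8-bit codes and label sets, for illustration only.
anchor = ([1, 0, 1, 1, 0, 0, 1, 0], {"sky", "sea"})
positive = ([1, 0, 1, 1, 0, 1, 1, 0], {"sky", "sea", "boat"})
negative = ([0, 1, 1, 0, 0, 1, 1, 0], {"car"})

loss_default = adaptive_triplet_loss(anchor, positive, negative)
loss_wide = adaptive_triplet_loss(anchor, positive, negative, base_margin=8.0)
```

With a fixed margin, every negative is pushed away by the same amount; tying the margin to label overlap is one simple way to make partially related images sit closer in code space than unrelated ones, which is the kind of hierarchical similarity structure the abstract refers to.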
Pages: 13