Creating Something from Nothing: Unsupervised Knowledge Distillation for Cross-Modal Hashing

Cited by: 111
Authors
Hu, Hengtong [1 ,2 ]
Xie, Lingxi [3 ]
Hong, Richang [1 ,2 ]
Tian, Qi [3 ]
Affiliations
[1] Hefei Univ Technol, Sch Comp Sci & Informat Engn, Hefei, Peoples R China
[2] Hefei Univ Technol, Key Lab Knowledge Engn Big Data, Hefei, Peoples R China
[3] Huawei Inc, Shenzhen, Peoples R China
Source
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2020
Funding
National Natural Science Foundation of China;
Keywords
DOI
10.1109/CVPR42600.2020.00319
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In recent years, cross-modal hashing (CMH) has attracted increasing attention, mainly because of its potential to map content from different modalities, especially vision and language, into the same space, which makes cross-modal data retrieval efficient. There are two main frameworks for CMH, which differ in whether semantic supervision is required. Compared to unsupervised methods, supervised methods often achieve more accurate results but require much more data annotation effort. In this paper, we propose a novel approach that guides a supervised method using the outputs produced by an unsupervised method. Specifically, we use teacher-student optimization to propagate knowledge. Experiments are performed on two popular CMH benchmarks, i.e., the MIRFlickr and NUS-WIDE datasets. Our approach outperforms all existing unsupervised methods by a large margin.
Pages: 3120-3129
Page count: 10