Creating Something from Nothing: Unsupervised Knowledge Distillation for Cross-Modal Hashing

Cited by: 111
Authors
Hu, Hengtong [1, 2]
Xie, Lingxi [3]
Hong, Richang [1, 2]
Tian, Qi [3]
Affiliations
[1] Hefei Univ Technol, Sch Comp Sci & Informat Engn, Hefei, Peoples R China
[2] Hefei Univ Technol, Key Lab Knowledge Engn Big Data, Hefei, Peoples R China
[3] Huawei Inc, Shenzhen, Peoples R China
Source
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2020
Funding
National Natural Science Foundation of China
Keywords
DOI
10.1109/CVPR42600.2020.00319
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, cross-modal hashing (CMH) has attracted increasing attention, mainly because of its potential to map content from different modalities, especially vision and language, into the same space, which makes cross-modal data retrieval efficient. There are two main frameworks for CMH, which differ in whether semantic supervision is required. Compared with unsupervised methods, supervised methods often produce more accurate results but require much more labor for data annotation. In this paper, we propose a novel approach that guides a supervised method using outputs produced by an unsupervised method. Specifically, we use teacher-student optimization to propagate knowledge. Experiments are performed on two popular CMH benchmarks, i.e., the MIRFlickr and NUS-WIDE datasets. Our approach outperforms all existing unsupervised methods by a large margin.
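The idea of guiding a supervised method with the outputs of an unsupervised one can be illustrated with a minimal sketch. The abstract does not say how teacher outputs are turned into supervision, so everything here is an assumption made for illustration: the function names, the toy 4-bit codes, and the rule that an image-text pair counts as similar when the Hamming distance between their teacher codes is at most `max_dist`.

```python
# Hypothetical sketch, not the authors' method: names, codes, and the
# thresholding rule are illustrative assumptions.

def hamming(a, b):
    """Number of positions at which two equal-length binary codes differ."""
    return sum(x != y for x, y in zip(a, b))

def pseudo_similarity(image_codes, text_codes, max_dist):
    """Build a 0/1 pseudo-label matrix from teacher hash codes.

    Entry [i][j] is 1 when image i and text j are within max_dist bits
    of each other, i.e. the unsupervised teacher treats them as similar.
    """
    return [[1 if hamming(img, txt) <= max_dist else 0 for txt in text_codes]
            for img in image_codes]

# Toy 4-bit codes standing in for the output of an unsupervised teacher.
teacher_image_codes = [[1, 0, 1, 1], [0, 0, 0, 1]]
teacher_text_codes = [[1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 0, 1]]

# Pseudo-labels that a supervised student could consume in place of
# human annotations.
S = pseudo_similarity(teacher_image_codes, teacher_text_codes, max_dist=1)
print(S)
```

In such a setup, the matrix `S` would play the role that annotated similarity labels play in a supervised CMH method, which is why no manual labeling is needed.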
Pages: 3120-3129
Page count: 10