Creating Something from Nothing: Unsupervised Knowledge Distillation for Cross-Modal Hashing

Cited by: 103
Authors
Hu, Hengtong [1, 2]
Xie, Lingxi [3]
Hong, Richang [1, 2]
Tian, Qi [3]
Affiliations
[1] Hefei Univ Technol, Sch Comp Sci & Informat Engn, Hefei, Peoples R China
[2] Hefei Univ Technol, Key Lab Knowledge Engn Big Data, Hefei, Peoples R China
[3] Huawei Inc, Shenzhen, Peoples R China
Source
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2020
Funding
National Natural Science Foundation of China
DOI
10.1109/CVPR42600.2020.00319
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, cross-modal hashing (CMH) has attracted increasing attention, mainly because of its potential to map content from different modalities, especially vision and language, into the same space, which makes cross-modal data retrieval efficient. There are two main frameworks for CMH, which differ in whether semantic supervision is required. Compared with unsupervised methods, supervised methods often achieve more accurate results but require much heavier data-annotation labor. In this paper, we propose a novel approach that guides a supervised method using outputs produced by an unsupervised method. Specifically, we use teacher-student optimization to propagate knowledge. Experiments are performed on two popular CMH benchmarks, i.e., the MIRFlickr and NUS-WIDE datasets. Our approach outperforms all existing unsupervised methods by a large margin.
Pages: 3120-3129
Page count: 10