Semantic Disentanglement Adversarial Hashing for Cross-Modal Retrieval

Cited by: 9
Authors
Meng, Min [1 ]
Sun, Jiaxuan [1 ]
Liu, Jigang [2 ]
Yu, Jun [1 ]
Wu, Jigang [1 ]
Affiliations
[1] Guangdong Univ Technol, Sch Comp Sci, Guangzhou 510006, Peoples R China
[2] Ping An Life Insurance China, Shenzhen 518046, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Cross-modal retrieval; hashing; adversarial learning; disentangled representation; REPRESENTATION; NETWORK;
DOI
10.1109/TCSVT.2023.3293104
Chinese Library Classification
TM [Electrotechnics]; TN [Electronic Technology, Communication Technology];
Discipline Codes
0808; 0809;
Abstract
Cross-modal hashing has gained considerable attention in cross-modal retrieval due to its low storage cost and high computational efficiency. However, preserving rich semantic information in compact hash codes to bridge the modality gap remains challenging. Most existing methods neglect the influence of modality-private information on the discrimination of semantic embeddings, leading to unsatisfactory retrieval performance. In this paper, we propose a novel deep cross-modal hashing method, called Semantic Disentanglement Adversarial Hashing (SDAH), to tackle these challenges. Specifically, SDAH decouples the original features of each modality into modality-common features carrying semantic information and modality-private features carrying disturbing information. After this preliminary decoupling, the modality-private features are shuffled and treated as positive interactions to enhance the learning of modality-common features, which significantly boosts the discriminativeness and robustness of the semantic embeddings. Moreover, a variational information bottleneck is introduced into the hash feature learning process, which avoids the large loss of semantic information caused by high-dimensional feature compression. Finally, discriminative and compact hash codes can be computed directly from the hash features. Extensive comparative and ablation experiments show that SDAH outperforms other state-of-the-art methods.
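The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustrative sketch only, not the paper's actual architecture: the linear encoders, feature dimensions, and function names here are assumptions introduced for clarity, and the adversarial and training objectives are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, w_common, w_private):
    # Decouple each modality's features into a modality-common part
    # (semantic information) and a modality-private part (disturbing
    # information). Linear projections stand in for the real encoders.
    return x @ w_common, x @ w_private

def shuffle_private(private):
    # Shuffle the private features across the batch so they can serve
    # as "positive interactions" when learning the common features.
    idx = rng.permutation(private.shape[0])
    return private[idx]

def vib_sample(mu, log_var):
    # Variational-information-bottleneck-style reparameterized sample
    # z = mu + sigma * eps, compressing features while retaining
    # semantic content.
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def hash_codes(features):
    # Binary hash codes are computed directly from the hash features.
    return np.where(features >= 0, 1, -1)

# Toy run: a batch of 4 image features (dim 8) -> 3-bit hash codes.
x = rng.standard_normal((4, 8))
w_c = rng.standard_normal((8, 3))
w_p = rng.standard_normal((8, 3))
common, private = encode(x, w_c, w_p)
z = vib_sample(common, log_var=np.zeros_like(common))
codes = hash_codes(z)
```

In a real implementation the encoders would be deep networks trained with adversarial and disentanglement losses; the sketch only shows the data flow from features to binary codes.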
Pages: 1914-1926
Page count: 13