Deep Binary Reconstruction for Cross-Modal Hashing

Cited by: 107
Authors
Hu, Di [1]
Nie, Feiping [1]
Li, Xuelong [2, 3, 4]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci & Engn, Xian 710072, Shaanxi, Peoples R China
[2] Northwestern Polytech Univ, Sch Comp Sci, Xian 710072, Shaanxi, Peoples R China
[3] Northwestern Polytech Univ, Ctr OPT IMagery Anal & Learning OPTIMAL, Xian 710072, Shaanxi, Peoples R China
[4] Chinese Acad Sci, Xian Inst Opt & Precis Mech, Xian 710119, Shaanxi, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Cross-modal hashing; binary reconstruction; IMAGE; CODES;
DOI
10.1109/TMM.2018.2866771
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
To satisfy the huge storage space and organization capacity requirements in addressing big multimodal data, hashing techniques have been widely employed to learn binary representations in cross-modal retrieval tasks. However, optimizing the hashing objective under the necessary binary constraint is a difficult problem. A common strategy is to relax the constraint and perform individual binarizations over the learned real-valued representations. In this paper, in contrast to conventional two-stage methods, we propose to directly learn the binary codes, where the model can be easily optimized by a standard gradient descent optimizer. Before that, we present a theoretical guarantee of the effectiveness of the multimodal network in preserving the inter- and intra-modal consistencies. Based on this guarantee, a novel multimodal deep binary reconstruction model is proposed, which can be trained to simultaneously model the correlation across modalities and learn the binary hashing codes. To generate binary codes and to avoid the tiny gradient problem, a novel activation function first scales the input activations to suitable scopes and then feeds them to the tanh function to build the hashing layer. Such a composite function is named adaptive tanh. Both linear and nonlinear scaling methods are proposed and shown to generate efficient codes after training the network. Extensive ablation studies and comparison experiments are conducted for the image2text and text2image retrieval tasks; the method is found to outperform several state-of-the-art deep-learning methods with respect to different evaluation metrics.
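The scale-then-tanh idea in the abstract can be illustrated with a minimal sketch. This record does not give the paper's actual linear or nonlinear scaling formulas, so the version below assumes the simplest possible case: a fixed scalar `scale` multiplying each pre-activation. The function names `adaptive_tanh`, `binarize`, and `quantization_gap` are hypothetical and chosen only for illustration.

```python
import math

def adaptive_tanh(activations, scale):
    """Scale pre-activations, then apply tanh.

    Assumption: a fixed scalar `scale`; the paper's own scaling
    schemes are not specified in this record.
    """
    return [math.tanh(scale * a) for a in activations]

def binarize(codes):
    """Map real-valued outputs to {-1, +1} by sign (zero maps to +1 here)."""
    return [1 if c >= 0 else -1 for c in codes]

def quantization_gap(codes):
    """Mean distance between each output and its nearest binary value."""
    return sum(1.0 - abs(c) for c in codes) / len(codes)

pre_activations = [-2.0, -0.3, 0.1, 1.5]
soft = adaptive_tanh(pre_activations, scale=1.0)    # plain tanh
sharp = adaptive_tanh(pre_activations, scale=10.0)  # outputs pushed toward -1 / +1

print(binarize(sharp))
print(quantization_gap(soft), quantization_gap(sharp))
```

Under this assumption, a larger scale moves the outputs closer to binary values, but the derivative `scale * (1 - tanh(scale * x) ** 2)` shrinks toward zero once the function saturates, which is one way to read the abstract's remark about scaling to a suitable range to avoid tiny gradients.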
Pages: 973 - 985
Page count: 13
Related papers
50 in total
  • [21] Quadruplet-Based Deep Cross-Modal Hashing
    Liu, Huan
    Xiong, Jiang
    Zhang, Nian
    Liu, Fuming
    Zou, Xitao
    COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2021, 2021 (2021)
  • [22] Deep semantics-preserving cross-modal hashing
    Lai, Zhihui
    Fang, Xiaomei
    Kong, Heng
    BIG DATA RESEARCH, 2024, 38
  • [23] Deep Multiscale Fusion Hashing for Cross-Modal Retrieval
    Nie, Xiushan
    Wang, Bowei
    Li, Jiajia
    Hao, Fanchang
    Jian, Muwei
    Yin, Yilong
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2021, 31 (01) : 401 - 410
  • [24] Dual Deep Neural Networks Cross-Modal Hashing
    Chen, Zhen-Duo
    Yu, Wan-Jin
    Li, Chuan-Xiang
    Nie, Liqiang
    Xu, Xin-Shun
    THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018 : 274 - 281
  • [25] Regularised Cross-Modal Hashing
    Moran, Sean
    Lavrenko, Victor
    SIGIR 2015: PROCEEDINGS OF THE 38TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2015 : 907 - 910
  • [26] Flexible Cross-Modal Hashing
    Yu, Guoxian
    Liu, Xuanwu
    Wang, Jun
    Domeniconi, Carlotta
    Zhang, Xiangliang
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (01) : 304 - 314
  • [27] Discriminant Cross-modal Hashing
    Xu, Xing
    Shen, Fumin
    Yang, Yang
    Shen, Heng Tao
    ICMR'16: PROCEEDINGS OF THE 2016 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, 2016 : 305 - 308
  • [28] Extensible Cross-Modal Hashing
    Chen, Tian-yi
    Zhang, Lan
    Zhang, Shi-cong
    Li, Zi-long
    Huang, Bai-chuan
    PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019 : 2109 - 2115
  • [29] Cross-Modal Discrete Hashing
    Liong, Venice Erin
    Lu, Jiwen
    Tan, Yap-Peng
    PATTERN RECOGNITION, 2018, 79 : 114 - 129
  • [30] Continuous cross-modal hashing
    Zheng, Hao
    Wang, Jinbao
    Zhen, Xiantong
    Song, Jingkuan
    Zheng, Feng
    Lu, Ke
    Qi, Guo-Jun
    PATTERN RECOGNITION, 2023, 142