Deep Binary Reconstruction for Cross-Modal Hashing

Cited by: 107

Authors:

Hu, Di ^{[1]}

Nie, Feiping ^{[1]}

Li, Xuelong ^{[2,3,4]}

Affiliations:

[1] Northwestern Polytech Univ, Sch Comp Sci & Engn, Xian 710072, Shaanxi, Peoples R China

[2] Northwestern Polytech Univ, Sch Comp Sci, Xian 710072, Shaanxi, Peoples R China

[3] Northwestern Polytech Univ, Ctr OPT IMagery Anal & Learning OPTIMAL, Xian 710072, Shaanxi, Peoples R China

[4] Chinese Acad Sci, Xian Inst Opt & Precis Mech, Xian 710119, Shaanxi, Peoples R China

Source:

IEEE TRANSACTIONS ON MULTIMEDIA | 2019 / Vol. 21 / No. 04

Funding:

National Natural Science Foundation of China;

Keywords:

Cross-modal hashing; binary reconstruction; IMAGE; CODES;

DOI:

10.1109/TMM.2018.2866771

Chinese Library Classification number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

To satisfy the huge storage space and organization capacity requirements in addressing big multimodal data, hashing techniques have been widely employed to learn binary representations in cross-modal retrieval tasks. However, optimizing the hashing objective under the necessary binary constraint is a difficult problem. A common strategy is to relax the constraint and perform individual binarizations over the learned real-valued representations. In this paper, in contrast to conventional two-stage methods, we propose to directly learn the binary codes, where the model can be easily optimized by a standard gradient descent optimizer. Before that, we present a theoretical guarantee of the effectiveness of the multimodal network in preserving the inter- and intra-modal consistencies. Based on this guarantee, a novel multimodal deep binary reconstruction model is proposed, which can be trained to simultaneously model the correlation across modalities and learn the binary hashing codes. To generate binary codes and to avoid the tiny gradient problem, a novel activation function first scales the input activations to suitable scopes and then feeds them to the tanh function to build the hashing layer. Such a composite function is named adaptive tanh. Both linear and nonlinear scaling methods are proposed and shown to generate efficient codes after training the network. Extensive ablation studies and comparison experiments are conducted for the image2text and text2image retrieval tasks; the method is found to outperform several state-of-the-art deep-learning methods with respect to different evaluation metrics.
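
The "scale, then tanh" composition described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual scaling rule, so the fixed `scale` factor, the function names, and the sample activations below are all hypothetical; this is not the authors' adaptive tanh, only a plain linear scaling followed by tanh and a sign-based binarization.

```python
import math


def scaled_tanh(activations, scale):
    """Multiply each activation by a fixed factor, then apply tanh.

    `scale` is a hypothetical constant; the paper's adaptive scaling
    rule is not specified in the abstract.
    """
    return [math.tanh(scale * a) for a in activations]


def binarize(values):
    """Map real values to {-1, +1} by sign (zero maps to +1)."""
    return [1 if v >= 0 else -1 for v in values]


def hamming_distance(code_a, code_b):
    """Count the positions where two equal-length codes differ."""
    return sum(1 for a, b in zip(code_a, code_b) if a != b)


# Hypothetical pre-activation values of a 4-bit hashing layer.
activations = [0.02, -0.5, 1.5, -0.01]

soft = scaled_tanh(activations, scale=1.0)   # outputs stay near the raw inputs
sharp = scaled_tanh(activations, scale=5.0)  # outputs move closer to -1 / +1
codes = binarize(sharp)

print(soft)
print(sharp)
print(codes)
```

Once codes are binary, retrieval across modalities reduces to comparing codes, for example with `hamming_distance(codes, other_codes)`, where `other_codes` would come from the other modality's hashing layer.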


Pages: 973-985

Page count: 13

