A novel deep translated attention hashing for cross-modal retrieval

Cited by: 0
Authors
Haibo Yu
Ran Ma
Min Su
Ping An
Kai Li
Affiliations
[1] School of Communication and Information Engineering, Shanghai University
[2] Shanghai University
[3] Shanghai Institute for Advanced Communication and Data Science
Source
Multimedia Tools and Applications | 2022, Vol. 81
Keywords
Cross-modal retrieval; Deep hashing; Attention; Multi-modal data
DOI
Not available
Abstract
In recent years, driven by the growing volume of cross-modal data such as images and texts, cross-modal retrieval has received intensive attention. Great progress has been made in deep cross-modal hash retrieval, which integrates feature learning and hash learning into an end-to-end trainable framework to obtain better hash codes. However, due to the heterogeneity between images and texts, comparing their similarity remains a challenge. Most previous approaches embed images and texts into a joint embedding subspace independently and then compare their similarity, which ignores both the influence of irrelevant regions (regions in images without a corresponding textual description) on cross-modal retrieval and the fine-grained interactions between images and texts. To address these issues, a new cross-modal hashing method called Deep Translated Attention Hashing for Cross-Modal Retrieval (DTAH) is proposed. First, DTAH extracts image and text features through bottom-up attention and a recurrent neural network, respectively, to reduce the influence of irrelevant regions on cross-modal retrieval. Then, with the help of a cross-modal attention module, DTAH captures the fine-grained interactions between vision and language at the region and word levels, and embeds the text features into the image feature space. In this way, the proposed DTAH effectively narrows the heterogeneity gap between images and texts and learns discriminative hash codes. Extensive experiments on three benchmark datasets demonstrate that DTAH surpasses state-of-the-art methods.
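The abstract describes the DTAH pipeline only at a high level: bottom-up attention region features for images, a recurrent encoder for text, a cross-modal attention module that maps text into the image feature space, and hash layers. The snippet below is a minimal PyTorch sketch of that general idea, not the authors' implementation; the layer sizes, the GRU text encoder, the scaled dot-product attention, and the tanh relaxation of the hash codes are all assumptions made for illustration.

```python
# Hypothetical sketch of the cross-modal attention + hashing idea summarized in the
# abstract. Image region features (e.g., from bottom-up attention) and text word
# features (from a GRU) interact through word-to-region attention; the attended text
# representation lives in the image feature space, and both branches emit
# tanh-relaxed hash codes. All dimensions and layer choices are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CrossModalAttentionHashing(nn.Module):
    def __init__(self, region_dim=2048, word_dim=300, hidden_dim=1024, hash_bits=64):
        super().__init__()
        # Text encoder: bidirectional GRU over word embeddings.
        self.gru = nn.GRU(word_dim, hidden_dim // 2, batch_first=True, bidirectional=True)
        # Project image region features into the shared hidden space.
        self.img_proj = nn.Linear(region_dim, hidden_dim)
        # Hash layers: tanh gives a continuous relaxation of binary codes.
        self.img_hash = nn.Linear(hidden_dim, hash_bits)
        self.txt_hash = nn.Linear(hidden_dim, hash_bits)

    def forward(self, regions, words):
        # regions: (B, R, region_dim) image region features
        # words:   (B, T, word_dim)   word embeddings
        img = self.img_proj(regions)               # (B, R, H)
        txt, _ = self.gru(words)                   # (B, T, H)

        # Word-to-region attention: each word attends over image regions,
        # "translating" the text into the image feature space.
        scores = torch.bmm(txt, img.transpose(1, 2)) / img.size(-1) ** 0.5  # (B, T, R)
        attn = F.softmax(scores, dim=-1)
        txt_in_img_space = torch.bmm(attn, img)    # (B, T, H)

        # Pool over regions / words to obtain global representations.
        img_vec = img.mean(dim=1)                  # (B, H)
        txt_vec = txt_in_img_space.mean(dim=1)     # (B, H)

        # Continuous hash codes; sign() would be applied at retrieval time.
        img_code = torch.tanh(self.img_hash(img_vec))
        txt_code = torch.tanh(self.txt_hash(txt_vec))
        return img_code, txt_code


# Usage with random tensors standing in for precomputed features.
model = CrossModalAttentionHashing()
regions = torch.randn(4, 36, 2048)   # 36 bottom-up regions per image (assumed)
words = torch.randn(4, 20, 300)      # 20 words per caption (assumed)
img_code, txt_code = model(regions, words)
print(img_code.shape, txt_code.shape)  # torch.Size([4, 64]) torch.Size([4, 64])
```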
Pages: 26443-26461
Number of pages: 18
Related papers
50 records in total
  • [1] A novel deep translated attention hashing for cross-modal retrieval
    Yu, Haibo
    Ma, Ran
    Su, Min
    An, Ping
    Li, Kai
    Multimedia Tools and Applications, 2022, 81(18): 26443-26461
  • [2] Deep semantic hashing with dual attention for cross-modal retrieval
    Wu, Jiagao
    Weng, Weiwei
    Fu, Junxia
    Liu, Linfeng
    Hu, Bin
    Neural Computing and Applications, 2022, 34(07): 5397-5416
  • [3] Attention-Aware Deep Adversarial Hashing for Cross-Modal Retrieval
    Zhang, Xi
    Lai, Hanjiang
    Feng, Jiashi
    Computer Vision - ECCV 2018, Pt 15, 2018, 11219: 614-629
  • [4] Deep medical cross-modal attention hashing
    Zhang, Yong
    Ou, Weihua
    Shi, Yufeng
    Deng, Jiaxin
    You, Xinge
    Wang, Anzhi
    World Wide Web - Internet and Web Information Systems, 2022, 25(04): 1519-1536
  • [5] Multi-attention based semantic deep hashing for cross-modal retrieval
    Zhu, Liping
    Tian, Gangyi
    Wang, Bingyao
    Wang, Wenjie
    Zhang, Di
    Li, Chengyang
    Applied Intelligence, 2021, 51(08): 5927-5939
  • [6] Deep Hashing Similarity Learning for Cross-Modal Retrieval
    Ma, Ying
    Wang, Meng
    Lu, Guangyun
    Sun, Yajun
    IEEE Access, 2024, 12: 8609-8618
  • [7] Supervised Hierarchical Deep Hashing for Cross-Modal Retrieval
    Zhan, Yu-Wei
    Luo, Xin
    Wang, Yongxin
    Xu, Xin-Shun
    MM '20: Proceedings of the 28th ACM International Conference on Multimedia, 2020: 3386-3394