Adaptive Label-Aware Graph Convolutional Networks for Cross-Modal Retrieval

Cited by: 26
Authors
Qian, Shengsheng [1, 2]
Xue, Dizhan [1, 2]
Fang, Quan [1, 2]
Xu, Changsheng [1, 2, 3]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
[3] Peng Cheng Lab, Shenzhen 518055, Guangdong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Correlation; Semantics; Task analysis; Adaptation models; Adaptive systems; Birds; Oceans; Cross-modal retrieval; Deep learning; Graph convolutional networks;
DOI
10.1109/TMM.2021.3101642
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Cross-modal retrieval has attracted sustained attention in recent years as the scale of multi-modal data grows, and it has broad application prospects, including multimedia data management and intelligent search engines. Most existing methods project data of different modalities into a common representation space, where label information is often exploited to distinguish samples from different semantic categories. However, they typically treat each label as an independent individual and ignore the underlying semantic structure of labels. In this paper, we propose an end-to-end adaptive label-aware graph convolutional network (ALGCN) with both an instance representation learning branch and a label representation learning branch, which can obtain modality-invariant and discriminative representations for cross-modal retrieval. First, we construct an instance representation learning branch to transform instances of different modalities into a common representation space. Second, we adopt a Graph Convolutional Network (GCN) to learn inter-dependent classifiers in the label representation learning branch. In addition, a novel adaptive correlation matrix is proposed to efficiently explore and preserve the semantic structure of labels in a data-driven manner. Together with a robust self-supervision loss for the GCN, the model can be supervised to learn an effective and robust correlation matrix for feature propagation. Comprehensive experimental results on three benchmark datasets, NUS-WIDE, MIRFlickr and MS-COCO, demonstrate the superiority of ALGCN over state-of-the-art methods in cross-modal retrieval.
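As a rough illustration of the idea in the abstract (a GCN propagating label embeddings over a label correlation matrix so that the resulting classifiers are inter-dependent), the sketch below runs one generic GCN layer in plain Python. It is not the authors' ALGCN: the correlation matrix here is fixed by hand rather than learned adaptively, there is no self-supervision loss, and every name and number (`normalize_adjacency`, `gcn_layer`, `label_scores`, the toy matrices) is a hypothetical choice made for this example.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalize_adjacency(adj):
    """Standard GCN normalization: D^{-1/2} (A + I) D^{-1/2}."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def gcn_layer(adj_norm, features, weight):
    """One propagation step: ReLU(A_norm @ H @ W)."""
    propagated = matmul(matmul(adj_norm, features), weight)
    return [[max(0.0, v) for v in row] for row in propagated]

def label_scores(instance, classifiers):
    """Score an instance feature against each label classifier (dot product)."""
    return [sum(x * w for x, w in zip(instance, clf)) for clf in classifiers]

# Toy setup (assumed values): 3 labels, labels 0 and 1 are correlated.
correlation = [[0.0, 1.0, 0.0],
               [1.0, 0.0, 0.0],
               [0.0, 0.0, 0.0]]
label_embeddings = [[1.0, 0.0],
                    [0.0, 1.0],
                    [1.0, 1.0]]
weight = [[1.0, 0.0],
          [0.0, 1.0]]  # identity, to keep the arithmetic easy to follow

classifiers = gcn_layer(normalize_adjacency(correlation), label_embeddings, weight)
scores = label_scores([2.0, 0.0], classifiers)  # instance feature in the common space
```

In this toy run, the two correlated labels end up with identical classifiers after propagation, which is the sense in which the classifiers become inter-dependent; a learned correlation matrix would replace the hand-written `correlation` above.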
Pages: 3520-3532
Page count: 13
Related papers
50 in total
  • [21] Multimodal Graph Learning for Cross-Modal Retrieval
    Xie, Jingyou
    Zhao, Zishuo
    Lin, Zhenzhou
    Shen, Ying
    PROCEEDINGS OF THE 2023 SIAM INTERNATIONAL CONFERENCE ON DATA MINING, SDM, 2023, : 145 - 153
  • [22] ACM: Adaptive Cross-Modal Graph Convolutional Neural Networks for RGB-D Scene Recognition
    Yuan, Yuan
    Xiong, Zhitong
    Wang, Qi
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 9176 - 9184
  • [23] Semi-supervised cross-modal hashing via modality-specific and cross-modal graph convolutional networks
    Wu, Fei
    Li, Shuaishuai
    Gao, Guangwei
    Ji, Yimu
    Jing, Xiao-Yuan
    Wan, Zhiguo
    PATTERN RECOGNITION, 2023, 136
  • [24] Adaptive Label Correlation Based Asymmetric Discrete Hashing for Cross-Modal Retrieval
    Li, Huaxiong
    Zhang, Chao
    Jia, Xiuyi
    Gao, Yang
    Chen, Chunlin
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, 35 (02) : 1185 - 1199
  • [25] Adaptive multi-label structure preserving network for cross-modal retrieval
    Zhu, Jie
    Zhang, Hui
    Chen, Junfen
    Xie, Bojun
    Liu, Jianan
    Zhang, Junsan
    INFORMATION SCIENCES, 2024, 682
  • [26] Integrating Multi-Label Contrastive Learning With Dual Adversarial Graph Neural Networks for Cross-Modal Retrieval
    Qian, Shengsheng
    Xue, Dizhan
    Fang, Quan
    Xu, Changsheng
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (04) : 4794 - 4811
  • [27] Non-Co-Occurrence Enhanced Multi-Label Cross-Modal Hashing Retrieval Based on Graph Convolutional Network
    Li, Mingyong
    Fan, Jiabao
    Lin, Ziyong
    IEEE ACCESS, 2023, 11 : 16310 - 16322
  • [28] Label Guided Discrete Hashing for Cross-Modal Retrieval
    Lan, Rushi
    Tan, Yu
    Wang, Xiaoqin
    Liu, Zhenbing
    Luo, Xiaonan
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2022, 23 (12) : 25236 - 25248
  • [29] Label Embedding Online Hashing for Cross-Modal Retrieval
    Wang, Yongxin
    Luo, Xin
    Xu, Xin-Shun
    MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020, : 871 - 879
  • [30] Label Distribution Guided Hashing for Cross-Modal Retrieval
    Lei, Fatang
    Zhang, Chao
    Li, Huaxiong
    Gao, Yang
    Chen, Chunlin
    ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2025, 19 (01)