Semi-supervised constrained graph convolutional network for cross-modal retrieval

Cited by: 6
Authors
Zhang, Lei [1, 2]
Chen, Leiting [1, 2, 3]
Ou, Weihua [4]
Zhou, Chuan [1, 2]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu, Peoples R China
[2] Univ Elect Sci & Technol China, Digital Media Technol Key Lab Sichuan Prov, Chengdu, Peoples R China
[3] Inst Elect & Informat Engn UESTC Guangdong, Dongguan, Peoples R China
[4] Guizhou Normal Univ, Sch Big Data & Comp Sci, Guiyang, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Cross-modal retrieval; Semi-supervised learning; Deep neural network; Graph convolutional network; IMAGES;
DOI
10.1016/j.compeleceng.2022.107994
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Exploiting the relationships among samples in cross-modal data plays a key role in cross-modal retrieval, but most existing methods extract correlation only from paired samples and ignore the relations among unpaired samples. Some graph regularization methods offer a reasonable paradigm for exploiting the correlation among multiple samples. However, limited by the traditional framework, their performance leaves much room for improvement. Moreover, although some existing DNN-based methods achieve excellent performance, they require massive amounts of labeled data. In this paper, we propose a novel semi-supervised method, the Semi-supervised Constrained Graph Convolutional Network (SCGCN), which adopts a graph convolutional network (GCN) to exploit the correlation among batch samples from different modalities. To reduce the need for labeled data, we design a two-stage training procedure: a deep supervised learning stage and an unsupervised learning stage. In the deep supervised learning stage, we integrate two DNN-based semantic encoding networks and a shared classifier into a Deep Cross-modal Semantic Encoding (DCSE) module, which is trained by supervised learning on labeled data. From the DCSE module, we learn a temporary modality-invariant space in which the semantic embeddings of samples from different modalities are modality-invariant, and we also learn a classifier that can generate predicted labels for the unlabeled data. In the unsupervised learning stage, to fully exploit the correlation in cross-modal data, we design a Constrained Graph Convolutional Network (CGCN) module, which uses a GCN to exploit the correlation and adopts both an intra-modal discriminative loss and an inter-modal pairwise similarity loss to ensure that the generated common representation is modality-invariant and semantically discriminative. We perform extensive experiments on four conventional datasets and a large-scale dataset to demonstrate the effectiveness of the proposed approach.
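The abstract's idea of using a GCN to exploit correlation among a batch of samples from different modalities can be illustrated with a minimal sketch. This is not the authors' SCGCN implementation: it shows only the standard GCN propagation rule, ReLU(D^{-1/2}(A + I)D^{-1/2} H W), applied to a toy batch. The cosine-similarity threshold used to build the graph, the toy embeddings, and all function names (`build_adjacency`, `gcn_layer`, and so on) are assumptions made for illustration and do not come from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def build_adjacency(feats, threshold=0.5):
    """Assumed graph construction: connect samples whose cosine similarity
    meets the threshold; the diagonal holds the self-loops (A + I)."""
    n = len(feats)
    return [[1.0 if i == j or cosine(feats[i], feats[j]) >= threshold else 0.0
             for j in range(n)] for i in range(n)]

def normalize(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2}."""
    deg = [sum(row) for row in adj]
    n = len(adj)
    return [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def gcn_layer(feats, weight, threshold=0.5):
    """One graph convolution: aggregate neighbors, project, apply ReLU."""
    a_hat = normalize(build_adjacency(feats, threshold))
    out = matmul(matmul(a_hat, feats), weight)
    return [[max(0.0, x) for x in row] for row in out]

# Toy batch: two image embeddings followed by two text embeddings,
# assumed to already live in a shared semantic space.
image_emb = [[1.0, 0.0], [0.0, 1.0]]
text_emb = [[0.9, 0.1], [0.1, 0.9]]
batch = image_emb + text_emb
identity = [[1.0, 0.0], [0.0, 1.0]]  # placeholder for a learned weight matrix
out = gcn_layer(batch, identity)
```

In this toy batch each image embedding is linked only to its semantically similar text embedding, so after one layer both members of a linked pair share the same averaged representation. In a trained model the weight matrix would be learned under losses such as those the abstract names, rather than fixed to the identity.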
Pages: 15
Related papers
50 in total
  • [1] SEMI-SUPERVISED GRAPH CONVOLUTIONAL HASHING NETWORK FOR LARGE-SCALE CROSS-MODAL RETRIEVAL
    Shen, Zhanjian
    Zhai, Deming
    Liu, Xianming
    Jiang, Junjun
    2020 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2020 : 2366 - 2370
  • [2] Semi-supervised Cross-Modal Hashing with Graph Convolutional Networks
    Duan, Jiasheng
    Luo, Yadan
    Wang, Ziwei
    Huang, Zi
    DATABASES THEORY AND APPLICATIONS, ADC 2020, 2020, 12008 : 93 - 104
  • [3] Semi-supervised cross-modal retrieval with graph-based semantic alignment network
    Zhang, Lei
    Chen, Leiting
    Ou, Weihua
    Zhou, Chuan
    COMPUTERS & ELECTRICAL ENGINEERING, 2022, 102
  • [4] Semantic Consistency Cross-Modal Retrieval With Semi-Supervised Graph Regularization
    Xu, Gongwen
    Li, Xiaomei
    Zhang, Zhijun
    IEEE ACCESS, 2020, 8 : 14278 - 14288
  • [5] A semi-supervised cross-modal memory bank for cross-modal retrieval
    Huang, Yingying
    Hu, Bingliang
    Zhang, Yipeng
    Gao, Chi
    Wang, Quan
    NEUROCOMPUTING, 2024, 579
  • [6] Semi-supervised cross-modal hashing via modality-specific and cross-modal graph convolutional networks
    Wu, Fei
    Li, Shuaishuai
    Gao, Guangwei
    Ji, Yimu
    Jing, Xiao-Yuan
    Wan, Zhiguo
    PATTERN RECOGNITION, 2023, 136
  • [7] Semi-Supervised Cross-Modal Retrieval With Label Prediction
    Mandal, Devraj
    Rao, Pramod
    Biswas, Soma
    IEEE TRANSACTIONS ON MULTIMEDIA, 2020, 22 (09) : 2345 - 2353
  • [9] Semi-supervised cross-modal learning for cross modal retrieval and image annotation
    Zou, Fuhao
    Bai, Xingqiang
    Luan, Chaoyang
    Li, Kai
    Wang, Yunfei
    Ling, Hefei
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2019, 22 (02) : 825 - 841
  • [10] Clustering-Based Semi-Supervised Cross-Modal Retrieval Using Scene Graph
    Kong, Yixue
    Feng, Yong
    Zhou, Mingliang
    Xiong, Xiancai
    Wang, Yongheng
    Qiang, Baohua
    JOURNAL OF CIRCUITS SYSTEMS AND COMPUTERS, 2022, 31 (12) : 1299 - 1314