Cross-modal semantic correlation learning by Bi-CNN network

被引:4
|
作者
Wang, Chaoyi [1 ]
Li, Liang [2 ]
Yan, Chenggang [1 ]
Wang, Zhan [3 ]
Sun, Yaoqi [1 ]
Zhang, Jiyong [1 ]
机构
[1] Hangzhou Dianzi Univ, Hangzhou, Peoples R China
[2] Chinese Acad Sci, Inst Comp Technol, Beijing, Peoples R China
[3] RTInvent Technol Co Ltd, Beijing, Peoples R China
关键词
REPRESENTATION;
D O I
10.1049/ipr2.12176
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Cross modal retrieval can retrieve images through a text query and vice versa. In recent years, cross modal retrieval has attracted extensive attention. The purpose of most now available cross modal retrieval methods is to find a common subspace and maximize the different modal correlation. To generate specific representations consistent with cross modal tasks, this paper proposes a novel cross modal retrieval framework, which integrates feature learning and latent space embedding. In detail, we proposed a deep CNN and a shallow CNN to extract the feature of the samples. The deep CNN is used to extract the representation of images, and the shallow CNN uses a multi-dimensional kernel to extract multi-level semantic representation of text. Meanwhile, we enhance the semantic manifold by constructing cross modal ranking and within-modal discriminant loss to improve the division of semantic representation. Moreover, the most representative samples are selected by using online sampling strategy, so that the approach can be implemented on a large-scale data. This approach not only increases the discriminative ability among different categories, but also maximizes the relativity between different modalities. Experiments on three real word datasets show that the proposed method is superior to the popular methods.
引用
收藏
页码:3674 / 3684
页数:11
相关论文
共 50 条
  • [1] Cross-Modal Correlation Learning by Adaptive Hierarchical Semantic Aggregation
    Hua, Yan
    Wang, Shuhui
    Liu, Siyuan
    Cai, Anni
    Huang, Qingming
    IEEE TRANSACTIONS ON MULTIMEDIA, 2016, 18 (06) : 1201 - 1216
  • [2] Deep Semantic Correlation with Adversarial Learning for Cross-Modal Retrieval
    Hua, Yan
    Du, Jianhe
    PROCEEDINGS OF 2019 IEEE 9TH INTERNATIONAL CONFERENCE ON ELECTRONICS INFORMATION AND EMERGENCY COMMUNICATION (ICEIEC 2019), 2019, : 252 - 255
  • [3] Modal-adversarial Semantic Learning Network for Extendable Cross-modal Retrieval
    Xu, Xing
    Song, Jingkuan
    Lu, Huimin
    Yang, Yang
    Shen, Fumin
    Huang, Zi
    ICMR '18: PROCEEDINGS OF THE 2018 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, 2018, : 46 - 54
  • [4] TINA: Cross-modal Correlation Learning by Adaptive Hierarchical Semantic Aggregation
    Hua, Yan
    Wang, Shuhui
    Liu, Siyuan
    Huang, Qingming
    Cai, Anni
    2014 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2014, : 190 - 199
  • [5] Analyzing semantic correlation for cross-modal retrieval
    Liang Xie
    Peng Pan
    Yansheng Lu
    Multimedia Systems, 2015, 21 : 525 - 539
  • [6] Analyzing semantic correlation for cross-modal retrieval
    Xie, Liang
    Pan, Peng
    Lu, Yansheng
    MULTIMEDIA SYSTEMS, 2015, 21 (06) : 525 - 539
  • [7] Learning Shared Semantic Space with Correlation Alignment for Cross-Modal Event Retrieval
    Yang, Zhenguo
    Lin, Zehang
    Kang, Peipei
    Lv, Jianming
    Li, Qing
    Liu, Wenyin
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2020, 16 (01)
  • [8] Adversarial Learning-Based Semantic Correlation Representation for Cross-Modal Retrieval
    Zhu, Lei
    Song, Jiayu
    Zhu, Xiaofeng
    Zhang, Chengyuan
    Zhang, Shichao
    Yuan, Xinpan
    IEEE MULTIMEDIA, 2020, 27 (04) : 79 - 90
  • [9] Jointly Learning Attentions with Semantic Cross-Modal Correlation for Visual Question Answering
    Cao, Liangfu
    Gao, Lianli
    Song, Jingkuan
    Xu, Xing
    Shen, Heng Tao
    DATABASES THEORY AND APPLICATIONS, ADC 2017, 2017, 10538 : 248 - 260
  • [10] Deep Semantic Correlation Learning based Hashing for Multimedia Cross-Modal Retrieval
    Gong, Xiaolong
    Huang, Linpeng
    Wang, Fuwei
    2018 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2018, : 117 - 126