Combining Global and Local Similarity for Cross-Media Retrieval

Cited by: 20
Authors
Li, Zhixin [1 ]
Ling, Feng [1 ]
Zhang, Canlong [1 ]
Ma, Huifang [2 ]
Affiliations
[1] Guangxi Normal Univ, Guangxi Key Lab Multisource Informat Min & Secur, Guilin 541004, Peoples R China
[2] Northwest Normal Univ, Coll Comp Sci & Engn, Lanzhou 730070, Peoples R China
Source
IEEE ACCESS | 2020 / Vol. 8 / Issue 08
Funding
National Natural Science Foundation of China;
Keywords
Convolutional neural network; self-attention network; attention mechanism; two-level network; cross-media retrieval;
DOI
10.1109/ACCESS.2020.2969808
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
This paper studies the problem of image-text matching, with the goal of aligning images and text more accurately. Existing cross-media retrieval methods exploit only part of the available information: they either match the whole image with the whole sentence, or match individual image regions with individual words. To better reveal the latent connection between image and text semantics, this paper proposes a cross-media image-text retrieval method that fuses two levels of similarity. It constructs a two-level cross-media network, containing one subnet for global features and one for local features, to explore better matching between images and texts. Specifically, the image is decomposed into the whole picture and its regions, and the text into whole sentences and individual words; each level is learned separately to explore the full latent alignment between images and text. A two-level alignment framework then lets the two subnets reinforce each other, and fusing the two kinds of similarity yields a more complete representation for cross-media retrieval. Experimental evaluation on the Flickr30K and MS-COCO datasets shows that the proposed method matches image and text semantics more accurately and outperforms popular cross-media retrieval methods on all evaluation metrics.
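The fusion of global and local similarity described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the cosine measure, the max-over-regions word alignment, and the fixed fusion weight `alpha` are all assumptions standing in for the paper's learned attention and two-level network.

```python
import numpy as np

def cosine(a, b):
    # Cosine similarity between two feature vectors (assumed measure).
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)

def global_similarity(img_vec, txt_vec):
    # Global level: whole-image embedding vs. whole-sentence embedding.
    return cosine(img_vec, txt_vec)

def local_similarity(region_vecs, word_vecs):
    # Local level: align each word with its best-matching image region,
    # then average over words (a crude stand-in for attention).
    sims = np.array([[cosine(r, w) for r in region_vecs] for w in word_vecs])
    return sims.max(axis=1).mean()

def fused_similarity(img_vec, txt_vec, region_vecs, word_vecs, alpha=0.5):
    # Fuse the two levels; alpha is a hypothetical fixed weight, whereas
    # the paper's two-level framework learns the combination end to end.
    return (alpha * global_similarity(img_vec, txt_vec)
            + (1 - alpha) * local_similarity(region_vecs, word_vecs))
```

At retrieval time, a query text would be scored against every candidate image with `fused_similarity`, and the candidates ranked by the resulting score.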
Pages: 21847-21856
Page count: 10
Related Papers
50 records in total
  • [41] A Novel Approach Towards Large Scale Cross-Media Retrieval
Lu, Bo
Wang, Guo-Ren
Yuan, Ye
JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2012, 27 (06) : 1140 - 1149
  • [42] Discriminative Coupled Dictionary Hashing for Fast Cross-Media Retrieval
    Yu, Zhou
    Wu, Fei
    Yang, Yi
    Tian, Qi
    Luo, Jiebo
    Zhuang, Yueting
    SIGIR'14: PROCEEDINGS OF THE 37TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2014, : 395 - 404
  • [43] Cross-media retrieval by intra-media and inter-media correlation mining
    Zhai, Xiaohua
    Peng, Yuxin
    Xiao, Jianguo
    MULTIMEDIA SYSTEMS, 2013, 19 (05) : 395 - 406
  • [44] Cross-media retrieval by intra-media and inter-media correlation mining
    Xiaohua Zhai
    Yuxin Peng
    Jianguo Xiao
    Multimedia Systems, 2013, 19 : 395 - 406
  • [45] An Overview of Cross-Media Retrieval: Concepts, Methodologies, Benchmarks, and Challenges
    Peng, Yuxin
    Huang, Xin
    Zhao, Yunzhen
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2018, 28 (09) : 2372 - 2385
  • [46] A grid-based framework for pervasive cross-media retrieval
    Zhang, Hong
    Zhuang, Yueting
    Wu, Fei
    2006 1ST INTERNATIONAL SYMPOSIUM ON PERVASIVE COMPUTING AND APPLICATIONS, PROCEEDINGS, 2006, : 183 - +
  • [47] Unsupervised Concept Learning in Text Subspace for Cross-Media Retrieval
    Fan, Mengdi
    Wang, Wenmin
    Dong, Peilei
    Wang, Ronggang
    Li, Ge
    ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2017, PT I, 2018, 10735 : 505 - 514
  • [48] Joint Graph Regularization in a Homogeneous Subspace for Cross-Media Retrieval
    Qi, Yudan
    Zhang, Huaxiang
    JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS, 2019, 23 (05) : 939 - 946
  • [49] A Novel Approach Towards Large Scale Cross-Media Retrieval
    Bo Lu
    Guo-Ren Wang
    Ye Yuan
    Journal of Computer Science and Technology, 2012, 27 : 1140 - 1149
  • [50] A Novel Approach Towards Large Scale Cross-Media Retrieval
    Lu, Bo
    Wang, Guo-Ren
    Yuan, Ye
    JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2012, 27 (06) : 1140 - 1149