A Multiview Text Imagination Network Based on Latent Alignment for Image-Text Matching

Cited by: 4
Authors
Shang, Heng [1]
Zhao, Guoshuai [1]
Shi, Jing [1]
Qian, Xueming [2]
Affiliations
[1] Xi An Jiao Tong Univ, Sch Software Engn, Xian 710049, Peoples R China
[2] Xi An Jiao Tong Univ, SMILES Lab, Xian 710049, Peoples R China
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation;
Keywords
Feature extraction; Semantics; Text mining; Intelligent systems; Image representation; Task analysis; Image edge detection;
DOI
10.1109/MIS.2023.3265176
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In image-text matching, one key to improving performance is extracting features that carry more semantic information. Existing work demonstrates that semantic enrichment through knowledge expansion can improve performance. Most of these methods expand image features; however, the shortage of semantic information in the text modality and the one-sidedness of a single view are often bottlenecks that limit the performance of image-text matching models. To address these two problems, we aggregate knowledge from multiple views and propose a word imagination graph (WIG). A WIG expands textual semantic information by imagination based on input images. Using the WIG, we then construct a novel multiview text imagination network (MTIN). An MTIN enables latent alignment of images and texts on tags, which assists matching at the semantic level. Results on the Flickr30K and MS-COCO datasets demonstrate the effectiveness of our method. The source code has been released on GitHub at https://github.com/smileslabsh/Multiview-Text-Imagination-Network.
Pages: 41-50
Number of pages: 10
Related papers
50 items in total
  • [31] PFAN++: Bi-Directional Image-Text Retrieval With Position Focused Attention Network
    Wang, Yaxiong
    Yang, Hao
    Bai, Xiuxiu
    Qian, Xueming
    Ma, Lin
    Lu, Jing
    Li, Biao
    Fan, Xin
    IEEE TRANSACTIONS ON MULTIMEDIA, 2021, 23 (23) : 3362 - 3376
  • [32] Image-Text Retrieval With Cross-Modal Semantic Importance Consistency
    Liu, Zejun
    Chen, Fanglin
    Xu, Jun
    Pei, Wenjie
    Lu, Guangming
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (05) : 2465 - 2476
  • [33] Exploring Fine-Grained Image-Text Alignment for Referring Remote Sensing Image Segmentation
    Lei, Sen
    Xiao, Xinyu
    Zhang, Tianlin
    Li, Heng-Chao
    Shi, Zhenwei
    Zhu, Qing
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2025, 63
  • [34] An Image-Text Dual-Channel Union Network for Person Re-Identification
    Qi, Baoguang
    Chen, Yi
    Liu, Qiang
    He, Xiaohai
    Qing, Linbo
    Sheriff, Ray E.
    Chen, Honggang
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2023, 72 : 1 - 16
  • [35] Visual Global-Salient-Guided Network for Remote Sensing Image-Text Retrieval
    He, Yangpeng
    Xu, Xin
    Chen, Hongjia
    Li, Jinwen
    Pu, Fangling
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62
  • [36] Memorize, Associate and Match: Embedding Enhancement via Fine-Grained Alignment for Image-Text Retrieval
    Li, Jiangtong
    Liu, Liu
    Niu, Li
    Zhang, Liqing
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 (30) : 9193 - 9207
  • [37] Global-Local Information Soft-Alignment for Cross-Modal Remote-Sensing Image-Text Retrieval
    Hu, Gang
    Wen, Zaidao
    Lv, Yafei
    Zhang, Jianting
    Wu, Qian
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62 : 1 - 15
  • [38] Causal Inference for Leveraging Image-Text Matching Bias in Multi-Modal Fake News Detection
    Hu, Linmei
    Chen, Ziwei
    Zhao, Ziwang
    Yin, Jianhua
    Nie, Liqiang
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, 35 (11) : 11141 - 11152
  • [39] Dual-Level Representation Enhancement on Characteristic and Context for Image-Text Retrieval
    Yang, Song
    Li, Qiang
    Li, Wenhui
    Li, Xuanya
    Liu, An-An
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (11) : 8037 - 8050
  • [40] Feature First: Advancing Image-Text Retrieval Through Improved Visual Features
    Wu, Dongqing
    Li, Huihui
    Gu, Cang
    Liu, Hang
    Xu, Cuili
    Hou, Yinxuan
    Guo, Lei
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 3827 - 3841