Full-view salient feature mining and alignment for text-based person search

Cited by: 4
Authors
Xie, Sheng [1]
Zhang, Canlong [1, 2]
Ning, Enhao [1]
Li, Zhixin [1, 2]
Wang, Zhiwen [3]
Wei, Chunrong [4]
Affiliations
[1] Guangxi Normal Univ, Key Lab Educ Blockchain & Intelligent Technol, Minist Educ, Guilin 541004, Peoples R China
[2] Guangxi Normal Univ, Guangxi Key Lab Multisource Informat Min & Secur, Guilin 541004, Peoples R China
[3] Guangxi Univ Sci & Technol, Sch Comp Sci & Technol, Liuzhou 545006, Peoples R China
[4] Guangxi Normal Univ, Teachers Coll Vocat & Tech Educ, Guilin 541004, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Text-based person search; Diffusion; Full-view; Generation; Text attention; OPTIMIZATION; NETWORK;
DOI
10.1016/j.eswa.2024.124071
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104; 0812; 0835; 1405;
Abstract
Text-based person search aims to retrieve relevant person images from a large database given textual queries. However, the single-view limitation of surveillance cameras and cross-modal heterogeneity remain open challenges. To address these, we propose a Full-view Salient Feature Mining Network (FLAN) to improve text-image matching in this task. FLAN introduces two key innovations. First, Diffusion-based Full-view Image Augmentation generates informative full-view data from a single image to simulate human visual observation and learn view-invariant features. Second, the Dual-max Text Attention module optimizes spatial and channel-wise text attention to extract the most discriminative words characterizing the person. Together, these innovations handle insufficient, imbalanced, and heterogeneous data for more accurate matching. Extensive experiments on three text-based person search datasets, CUHK-PEDES, ICFG-PEDES, and RSTPReid, demonstrate the superior performance of FLAN, with improved robustness and generalization.
Pages: 13
Related papers
44 in total
  • [21] Improving embedding learning by virtual attribute decoupling for text-based person search
    Chengji Wang
    Zhiming Luo
    Yaojin Lin
    Shaozi Li
    Neural Computing and Applications, 2022, 34 : 5625 - 5647
  • [22] TIPCB: A simple but effective part-based convolutional baseline for text-based person search
    Chen, Yuhao
    Zhang, Guoqing
    Lu, Yujiang
    Wang, Zhenxing
    Zheng, Yuhui
    NEUROCOMPUTING, 2022, 494 : 171 - 181
  • [23] Learning shared features from specific and ambiguous descriptions for text-based person search
    Ke Cheng
    Qikai Geng
    Shucheng Huang
    Juanjuan Tu
    Hu Lu
    Multimedia Systems, 2024, 30
  • [24] PLOT: Text-Based Person Search with Part Slot Attention for Corresponding Part Discovery
    Park, Jicheol
    Kim, Dongwon
    Jeong, Boseung
    Kwak, Suha
    COMPUTER VISION - ECCV 2024, PT XXI, 2025, 15079 : 474 - 490
  • [25] Learning shared features from specific and ambiguous descriptions for text-based person search
    Cheng, Ke
    Geng, Qikai
    Huang, Shucheng
    Tu, Juanjuan
    Lu, Hu
    MULTIMEDIA SYSTEMS, 2024, 30 (02)
  • [26] VGSG: Vision-Guided Semantic-Group Network for Text-Based Person Search
    He, Shuting
    Luo, Hao
    Jiang, Wei
    Jiang, Xudong
    Ding, Henghui
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 163 - 176
  • [27] PaSeMix: A Multi-modal Partitional Semantic Data Augmentation Method for Text-Based Person Search
    Yuan, Xinpan
    Li, Jiabao
    Gan, Wenguang
    Xia, Wei
    Weng, Yanbin
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT III, ICIC 2024, 2024, 14864 : 468 - 479
  • [28] Multi-granularity relation-aware and conditional query learning for text-based person search
    Wang, Xiaoyong
    Yang, Jianxi
    JOURNAL OF ELECTRONIC IMAGING, 2025, 34 (01)
  • [29] SUM: Serialized Updating and Matching for text-based person retrieval
    Wang, Zijie
    Zhu, Aichun
    Xue, Jingyi
    Jiang, Daihong
    Liu, Chao
    Li, Yifeng
    Hu, Fangqiang
    KNOWLEDGE-BASED SYSTEMS, 2022, 248
  • [30] Full-view low-cost LED-based optoacoustic tomography
    Liu, Xiang
    Kalva, Sandeep Kumar
    Lafci, Berkan
    Nozdriukhin, Daniil
    Dean-Ben, Xose Luis
    Razansky, Daniel
    PHOTONS PLUS ULTRASOUND: IMAGING AND SENSING 2024, 2024, 12842