Full-view salient feature mining and alignment for text-based person search

Cited by: 4
Authors
Xie, Sheng [1]
Zhang, Canlong [1, 2]
Ning, Enhao [1]
Li, Zhixin [1, 2]
Wang, Zhiwen [3]
Wei, Chunrong [4]
Affiliations
[1] Guangxi Normal Univ, Key Lab Educ Blockchain & Intelligent Technol, Minist Educ, Guilin 541004, Peoples R China
[2] Guangxi Normal Univ, Guangxi Key Lab Multisource Informat Min & Secur, Guilin 541004, Peoples R China
[3] Guangxi Univ Sci & Technol, Sch Comp Sci & Technol, Liuzhou 545006, Peoples R China
[4] Guangxi Normal Univ, Teachers Coll Vocat & Tech Educ, Guilin 541004, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Text-based person search; Diffusion; Full-view; Generation; Text attention; OPTIMIZATION; NETWORK;
DOI
10.1016/j.eswa.2024.124071
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Text-based person search aims to retrieve relevant person images from a large database given textual queries. However, the single-view limitation of surveillance cameras and cross-modal heterogeneity remain open challenges. To address these, we propose a Full-view Salient Feature Mining Network (FLAN) to improve text-image matching in this task. FLAN introduces two key innovations. First, Diffusion-based Full-view Image Augmentation generates informative full-view data from a single image to simulate human visual observation and learn view-invariant features. Second, the Dual-max Text Attention module optimizes spatial and channel-wise text attention to extract the most discriminative words characterizing the person. Together, these innovations handle insufficient, imbalanced, and heterogeneous data for more accurate matching. Extensive experiments on three text-based person search datasets, CUHK-PEDES, ICFG-PEDES and RSTPReid, demonstrate the superior performance of FLAN, with improved robustness and generalization.
Pages: 13