Full-view salient feature mining and alignment for text-based person search

Cited by: 4

Authors:

Xie, Sheng [1]

Zhang, Canlong [1,2]

Ning, Enhao [1]

Li, Zhixin [1,2]

Wang, Zhiwen [3]

Wei, Chunrong [4]

Affiliations:

[1] Guangxi Normal Univ, Key Lab Educ Blockchain & Intelligent Technol, Minist Educ, Guilin 541004, Peoples R China

[2] Guangxi Normal Univ, Guangxi Key Lab Multisource Informat Min & Secur, Guilin 541004, Peoples R China

[3] Guangxi Univ Sci & Technol, Sch Comp Sci & Technol, Liuzhou 545006, Peoples R China

[4] Guangxi Normal Univ, Teachers Coll Vocat & Tech Educ, Guilin 541004, Peoples R China

Source:

EXPERT SYSTEMS WITH APPLICATIONS | 2024 / Vol. 251

Funding:

National Natural Science Foundation of China;

Keywords:

Text-based person search; Diffusion; Full-view; Generation; Text attention; OPTIMIZATION; NETWORK;

DOI:

10.1016/j.eswa.2024.124071

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Text-based person search aims to retrieve relevant person images from a large database given textual queries. However, the single-view limitation of surveillance cameras and cross-modal heterogeneity remain open challenges. To address these, we propose a Full-view Salient Feature Mining Network (FLAN) to improve text-image matching in this task. FLAN introduces two key innovations. First, Diffusion-based Full-view Image Augmentation generates informative full-view data from a single image to simulate human visual observation and learn view-invariant features. Second, the Dual-max Text Attention module optimizes spatial and channel-wise text attention to extract the most discriminative words characterizing the person. Together, these innovations address insufficient, imbalanced, and heterogeneous data for more accurate matching. Extensive experiments on three text-based person search datasets, CUHK-PEDES, ICFG-PEDES, and RSTPReid, demonstrate the superior performance of FLAN, with improved robustness and generalization.


Pages: 13
