Local-enhanced representation for text-based person search

Cited by: 3
Authors
Zhang, Guoqing [1, 2]
Chen, Yuhao [1 ]
Zheng, Yuhui [1 ]
Martin, Gaven [3 ]
Wang, Ruili [2, 4]
Affiliations
[1] Nanjing Univ Informat Sci & Technol, Sch Comp Sci, Nanjing, Peoples R China
[2] Massey Univ, Sch Math & Computat Sci, Auckland, New Zealand
[3] Massey Univ, Inst Adv Study, Auckland, New Zealand
[4] Univ Nottingham Ningbo China, Sch Comp Sci, Ningbo, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Person re-identification; Cross-modal retrieval; Local representation;
DOI
10.1016/j.patcog.2024.111247
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Text-based person search is a critical task in intelligent security that aims to locate a person of interest from a text description. The primary challenge is to bridge the significant gap between the text and image domains while extracting the discriminative features needed to identify individuals accurately. Existing methods have made progress by conducting cross-modal matching at the fine-grained representation level. However, these approaches frequently overlook two crucial factors: (i) the presence of noise in the local features during information fusion, and (ii) the lack of intra-modal matching when measuring feature similarity. To address these issues, we propose a novel local-enhanced representation framework. Specifically, to suppress noise in local features, we design a relation-based cross-modal local-enhanced fusion module, which filters out weakly related information through relation assessment. In addition, we explore an intra-cross modal projection strategy to overcome the limitations of existing cross-modal projection methods. This strategy jointly applies intra-modal and cross-modal matching constraints to the feature distribution. Finally, experiments on three mainstream datasets verify that the proposed method outperforms existing state-of-the-art methods.
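The abstract does not give the formulation of the intra-cross modal projection strategy, so the following is only an illustrative sketch of what jointly applying intra-modal and cross-modal matching constraints could look like. It assumes a projection-matching objective of the kind used in earlier cross-modal projection work: a KL divergence between the softmax of scalar projections and the normalized identity-match distribution. The function names, the equal weighting of the four terms, and the `eps` value are hypothetical and are not taken from the paper.

```python
import math


def _dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def _unit(v):
    norm = math.sqrt(_dot(v, v))
    return [x / norm for x in v]


def projection_matching_loss(src, dst, labels, eps=1e-8):
    """Mean KL(p || q) over the batch (assumed form, not the paper's).

    p: softmax over the scalar projections of src[i] onto each
       unit-normalized dst[j].
    q: identity-match indicators for sample i, normalized to sum to 1.
    """
    dst_unit = [_unit(v) for v in dst]
    n = len(src)
    total = 0.0
    for i in range(n):
        scores = [_dot(src[i], d) for d in dst_unit]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        z = sum(exps)
        p = [e / z for e in exps]
        match = [1.0 if labels[i] == labels[j] else 0.0 for j in range(n)]
        q = [m / sum(match) for m in match]
        total += sum(
            pj * math.log(pj / (qj + eps)) for pj, qj in zip(p, q) if pj > 0.0
        )
    return total / n


def intra_cross_projection_loss(img, txt, labels):
    """Hypothetical combination: cross-modal terms plus intra-modal terms."""
    cross = (projection_matching_loss(img, txt, labels)
             + projection_matching_loss(txt, img, labels))
    intra = (projection_matching_loss(img, img, labels)
             + projection_matching_loss(txt, txt, labels))
    return cross + intra
```

Under these assumptions, a batch whose image and text features agree on identity yields a much smaller loss than one in which the text features are paired with the wrong identities. The intra-modal terms additionally penalize same-modality features of different identities that project strongly onto one another.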
Pages: 12