Local-enhanced representation for text-based person search

Cited by: 3

Authors:

Zhang, Guoqing [1,2]

Chen, Yuhao [1]

Zheng, Yuhui [1]

Martin, Gaven [3]

Wang, Ruili [2,4]

Affiliations:

[1] Nanjing Univ Informat Sci & Technol, Sch Comp Sci, Nanjing, Peoples R China

[2] Massey Univ, Sch Math & Computat Sci, Auckland, New Zealand

[3] Massey Univ, Inst Adv Study, Auckland, New Zealand

[4] Univ Nottingham Ningbo China, Sch Comp Sci, Ningbo, Peoples R China

Source:

PATTERN RECOGNITION | 2025 / Volume 161

Funding:

National Natural Science Foundation of China;

Keywords:

Person re-identification; Cross-modal retrieval; Local representation;

DOI:

10.1016/j.patcog.2024.111247

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Text-based person search is a critical task in intelligent security that aims to locate a person of interest from a text description. The primary challenge is to bridge the significant gap between the text and image domains while extracting the discriminative features needed to identify individuals accurately. Existing methods have made some effective attempts by conducting cross-modal matching at the fine-grained representation level. However, these approaches frequently overlook two crucial factors: (i) the presence of noise in the local features during information fusion, and (ii) the lack of intra-modal matching when measuring feature similarity. To address these issues, we propose a novel local-enhanced representation framework. Specifically, to suppress noise in local features, we design a relation-based cross-modal local-enhanced fusion module, which filters out weakly related information through relation assessment. In addition, we explore an intra-cross modal projection strategy to overcome the limitations of existing cross-modal projection methods. This strategy jointly applies intra-modal and cross-modal matching constraints to the feature distribution. Finally, experiments on three mainstream datasets verify the superior performance of the proposed method compared with existing state-of-the-art methods.


Pages: 12
