An Overview of Text-Based Person Search: Recent Advances and Future Directions

Cited by: 1
Authors
Niu, Kai [1, 2]
Liu, Yanyi [1]
Long, Yuzhou [1]
Huang, Yan [3]
Wang, Liang [3]
Zhang, Yanning [1]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, Natl Engn Lab Integrated AeroSpace Ground Ocean B, Xian 710072, Peoples R China
[2] Northwestern Polytech Univ Shenzhen, Inst Res & Dev, Shenzhen 518063, Peoples R China
[3] Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Text-based person search; cross-modal retrieval; video surveillance; feature extraction; semantic alignments; NEURAL-NETWORK; ATTENTION NETWORK; IMAGE; TRANSFORMER;
DOI
10.1109/TCSVT.2024.3376373
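Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract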
Owing to its practical significance in smart video surveillance systems, Text-Based Person Search (TBPS), which refers to searching for pedestrian images of interest given natural language sentences, has recently become a research hotspot. To help researchers quickly grasp the developments of this important task, we comprehensively summarize the recent research advances of TBPS from two perspectives, i.e., Feature Extraction (FE) and Semantic Alignments (SA). Specifically, FE mainly consists of pre-processing approaches and end-to-end frameworks, and SA can be broadly divided into cross-modal attention mechanisms, non-attention alignments, training objectives, and generative approaches. Afterwards, we describe four widely used benchmarks and the evaluation criterion for TBPS. Comparisons and analyses among the state-of-the-art (SOTA) solutions are provided based on these large-scale benchmarks. Finally, we point out some future research directions that need to be further addressed, which will greatly facilitate the practical applications of TBPS.
Pages: 7803-7819
Number of pages: 17