An Overview of Text-based Person Search: Recent Advances and Future Directions

Authors
Niu K. [1]
Liu Y. [1]
Long Y. [1]
Huang Y. [3]
Wang L. [3]
Zhang Y. [1]
Affiliations
[1] Institute of Automation, National Laboratory of Pattern Recognition, Chinese Academy of Sciences, Beijing
Funding
National Natural Science Foundation of China
Keywords
Benchmark testing; cross-modal retrieval; feature extraction; pedestrians; semantic alignments; semantics; task analysis; text-based person search; training; video surveillance; visualization
DOI
10.1109/TCSVT.2024.3376373
Abstract
Owing to its practical significance for smart video surveillance systems, Text-Based Person Search (TBPS), which aims to retrieve images of a pedestrian of interest given a natural language description, has recently become a research hotspot. To help researchers quickly grasp developments in this task, we comprehensively summarize recent research advances in TBPS from two perspectives, i.e., Feature Extraction (FE) and Semantic Alignments (SA). Specifically, FE mainly consists of pre-processing approaches and end-to-end frameworks, while SA can be broadly divided into cross-modal attention mechanisms, non-attention alignments, training objectives, and generative approaches. We then describe four widely used benchmarks and the evaluation criterion for TBPS, and provide comparisons and analyses of state-of-the-art (SOTA) solutions on these large-scale benchmarks. Finally, we point out future research directions that need to be further addressed in order to facilitate the practical application of TBPS.
Related papers
50 in total
  • [1] Multi-Granularity Matching Transformer for Text-Based Person Search
    Bao, Liping
    Wei, Longhui
    Zhou, Wengang
    Liu, Lin
    Xie, Lingxi
    Li, Houqiang
    Tian, Qi
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 4281 - 4293
  • [2] Conditional Feature Learning Based Transformer for Text-Based Person Search
    Gao, Chenyang
    Cai, Guanyu
    Jiang, Xinyang
    Zheng, Feng
    Zhang, Jun
    Gong, Yifei
    Lin, Fangzhou
    Sun, Xing
    Bai, Xiang
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 6097 - 6108
  • [3] Joint Token and Feature Alignment Framework for Text-Based Person Search
    Li, Shangze
    Lu, Andong
    Huang, Yan
    Li, Chenglong
    Wang, Liang
    IEEE SIGNAL PROCESSING LETTERS, 2022, 29 : 2238 - 2242
  • [4] Cross-Modal Feature Fusion-Based Knowledge Transfer for Text-Based Person Search
    You, Kaiyang
    Chen, Wenjing
    Wang, Chengji
    Sun, Hao
    Xie, Wei
    IEEE SIGNAL PROCESSING LETTERS, 2024, 31 : 2230 - 2234
  • [5] Text-Guided Visual Feature Refinement for Text-Based Person Search
    Gao, Liying
    Niu, Kai
    Ma, Zehong
    Jiao, Bingliang
    Tan, Tonghao
    Wang, Peng
    PROCEEDINGS OF THE 2021 INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL (ICMR '21), 2021, : 118 - 126
  • [6] Learning Semantic Polymorphic Mapping for Text-Based Person Retrieval
    Li, Jiayi
    Jiang, Min
    Kong, Jun
    Tao, Xuefeng
    Luo, Xi
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 10678 - 10691
  • [7] Feature semantic alignment and information supplement for Text-based person search
    Zhou, Hang
    Li, Fan
    Tian, Xuening
    Huang, Yuling
    FRONTIERS IN PHYSICS, 2023, 11
  • [8] FedSH: Towards Privacy-Preserving Text-Based Person Re-Identification
    Ma, Wentao
    Wu, Xinyi
    Zhao, Shan
    Zhou, Tongqing
    Guo, Dan
    Gu, Lichuan
    Cai, Zhiping
    Wang, Meng
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 5065 - 5077
  • [9] Improving Text-Based Person Retrieval by Excavating All-Round Information Beyond Color
    Zhu, Aichun
    Wang, Zijie
    Xue, Jingyi
    Wan, Xili
    Jin, Jing
    Wang, Tian
    Snoussi, Hichem
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, : 1 - 15
  • [10] Cross-modal alignment with synthetic caption for text-based person search
    Zhao, Weichen
    Lu, Yuxing
    Liu, Zhiyuan
    Yang, Yuan
    Jiao, Ge
    INTERNATIONAL JOURNAL OF MULTIMEDIA INFORMATION RETRIEVAL, 2025, 14 (2)