An Overview of Text-Based Person Search: Recent Advances and Future Directions

Authors
Niu K. [1, 2]
Liu Y. [1 ]
Long Y. [1 ]
Huang Y. [3 ]
Wang L. [3 ]
Zhang Y. [1 ]
Affiliations
[1] Northwestern Polytechnical University, National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology, School of Computer Science, Xi'an
[2] Research & Development Institute of Northwestern Polytechnical University in Shenzhen, Shenzhen
[3] Institute of Automation, Chinese Academy of Sciences, National Laboratory of Pattern Recognition, Beijing
Funding
National Natural Science Foundation of China
Keywords
cross-modal retrieval; feature extraction; semantic alignments; text-based person search; video surveillance
DOI
10.1109/TCSVT.2024.3376373
Abstract
Owing to its practical significance in smart video surveillance systems, Text-Based Person Search (TBPS), which refers to searching for pedestrian images of interest given natural language sentences, has recently become a research hotspot. To help researchers quickly grasp the developments of this important task, we comprehensively summarize recent research advances in TBPS from two perspectives: Feature Extraction (FE) and Semantic Alignments (SA). Specifically, FE mainly consists of pre-processing approaches and end-to-end frameworks, while SA can be broadly divided into cross-modal attention mechanisms, non-attention alignments, training objectives, and generative approaches. Afterwards, we describe four widely used benchmarks and the evaluation criterion for TBPS, and provide comparisons and analyses of state-of-the-art (SOTA) solutions on these large-scale benchmarks. Finally, we point out future research directions that still need to be addressed and that would greatly facilitate the practical application of TBPS. © 1991-2012 IEEE.
Pages: 7803-7819
Page count: 16
Related papers
48 in total
  • [21] Part-Based Multi-Scale Attention Network for Text-Based Person Search
    Wang, Yubin
    Qi, Ding
    Zhao, Cairong
    PATTERN RECOGNITION AND COMPUTER VISION, PT I, PRCV 2022, 2022, 13534 : 462 - 474
  • [22] Full-view salient feature mining and alignment for text-based person search
    Xie, Sheng
    Zhang, Canlong
    Ning, Enhao
    Li, Zhixin
    Wang, Zhiwen
    Wei, Chunrong
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 251
  • [24] Learning shared features from specific and ambiguous descriptions for text-based person search
    Cheng, Ke
    Geng, Qikai
    Huang, Shucheng
    Tu, Juanjuan
    Lu, Hu
    MULTIMEDIA SYSTEMS, 2024, 30 (02)
  • [25] Fine-grained semantic oriented embedding set alignment for text-based person search
    Zhao, Jiaqi
    Fu, Ao
    Zhou, Yong
    Du, Wen-liang
    Yao, Rui
    IMAGE AND VISION COMPUTING, 2024, 152
  • [26] Text-based Person Search in Full Images via Semantic-Driven Proposal Generation
    Zhang, Shizhou
    Cheng, De
    Luo, Wenlong
    Xing, Yinghui
    Long, Duo
    Li, Hao
    Niu, Kai
    Liang, Guoqiang
    Zhang, Yanning
    PROCEEDINGS OF THE 4TH INTERNATIONAL WORKSHOP ON HUMAN-CENTRIC MULTIMEDIA ANALYSIS, HCMA 2023, 2023, : 5 - 14
  • [27] VGSG: Vision-Guided Semantic-Group Network for Text-Based Person Search
    He, Shuting
    Luo, Hao
    Jiang, Wei
    Jiang, Xudong
    Ding, Henghui
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 163 - 176
  • [28] Enhancing visual representation for text-based person searching
    Shen, Wei
    Fang, Ming
    Wang, Yuxia
    Xiao, Jiafeng
    Li, Diping
    Chen, Huangqun
    Xu, Ling
    Zhang, Weifeng
    KNOWLEDGE-BASED SYSTEMS, 2025, 309
  • [29] PaSeMix: A Multi-modal Partitional Semantic Data Augmentation Method for Text-Based Person Search
    Yuan, Xinpan
    Li, Jiabao
    Gan, Wenguang
    Xia, Wei
    Weng, Yanbin
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT III, ICIC 2024, 2024, 14864 : 468 - 479
  • [30] Multi-granularity relation-aware and conditional query learning for text-based person search
    Wang, Xiaoyong
    Yang, Jianxi
    JOURNAL OF ELECTRONIC IMAGING, 2025, 34 (01)