PMG-Pyramidal Multi-Granular Matching for Text-Based Person Re-Identification

Cited by: 1
Authors
Liu, Chao [1]
Xue, Jingyi [2]
Wang, Zijie [2]
Zhu, Aichun [2]
Affiliations
[1] Jinling Inst Technol, Sch Intelligent Sci & Control Engn, Nanjing 211199, Peoples R China
[2] Nanjing Tech Univ, Sch Comp Sci & Technol, Nanjing 211816, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2023 / Vol. 13 / Issue 21
Keywords
text-based person retrieval; person re-identification; multi-granular matching
DOI
10.3390/app132111876
Chinese Library Classification
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Given a textual query, text-based person re-identification aims to retrieve images of the target pedestrian from a large-scale visual database. Because of the inherent heterogeneity between modalities, measuring the cross-modal affinity between visual and textual data is challenging. Existing works typically employ single-granular methods to extract local features and align image regions with relevant words or phrases. However, single-granular methods have limited robustness and cannot adapt to the imprecision and variance of visual and textual features, which are usually affected by background clutter, position transformation, posture diversity, and occlusion in surveillance videos; this degrades cross-modal matching accuracy. In this paper, we propose a Pyramidal Multi-Granular matching network (PMG) that uses a coarse-to-fine pyramid to transition gradually from the coarsest global information to the finest local information for multi-granular cross-modal feature extraction and affinity learning. For each body part of a pedestrian, PMG preserves the integrity of local information at a certain scale while minimizing surrounding interference signals. It adapts to capture the discriminative signals of different body parts and achieves semantic alignment between image strips and the relevant textual descriptions, thus suppressing the variance of feature extraction and improving the robustness of feature matching. Comprehensive experiments on the CUHK-PEDES and RSTPReid datasets validate the effectiveness of the proposed method, and the results show that PMG significantly outperforms state-of-the-art (SOTA) methods and achieves competitive cross-modal retrieval accuracy.
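The coarse-to-fine idea in the abstract can be illustrated with a minimal sketch. It is not the authors' architecture: the equal, non-overlapping horizontal strips, the granularity set `(1, 2, 3, 6)`, the average pooling, the cosine affinity, and the names `pyramid_strip_features` and `cosine_affinity` are all assumptions made for illustration, and random arrays stand in for real image and text features.

```python
import numpy as np


def pyramid_strip_features(feature_map, granularities=(1, 2, 3, 6)):
    """Average-pool a (H, W, C) feature map into horizontal strips.

    Granularity 1 is the coarsest (global) level; larger values give
    finer local strips. Equal-sized strips are an illustrative assumption.
    """
    height, width, channels = feature_map.shape
    pyramid = {}
    for g in granularities:
        if height % g != 0:
            raise ValueError(f"height {height} is not divisible by {g}")
        # Group consecutive rows into g strips, then pool each strip.
        strips = feature_map.reshape(g, height // g, width, channels)
        pyramid[g] = strips.mean(axis=(1, 2))  # shape (g, C)
    return pyramid


def cosine_affinity(a, b):
    """Cosine similarity between every row of a and every row of b."""
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T


# Hypothetical inputs: random stand-ins for image and text features.
rng = np.random.default_rng(0)
feature_map = rng.standard_normal((12, 4, 8))   # (H, W, C)
text_feature = rng.standard_normal((1, 8))      # one textual embedding

pyramid = pyramid_strip_features(feature_map)
affinities = {g: cosine_affinity(f, text_feature) for g, f in pyramid.items()}
```

Each entry of `affinities` holds one score per strip at that granularity, so a matching scheme could combine global and local evidence across levels; how PMG actually combines them is not specified in this record.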
Pages: 18