Multi-level Part-aware Feature Disentangling for Text-based Person Search

Cited by: 2
Authors
Chen, Yuhao [1]
Zhang, Guoqing [1]
Zhang, Hongwei [1]
Zheng, Yuhui [1]
Lin, Weisi [2]
Affiliations
[1] Nanjing Univ Informat Sci & Technol, Sch Comp & Software, Nanjing, Peoples R China
[2] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore, Singapore
Source
2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME | 2023
Funding
National Natural Science Foundation of China;
Keywords
Image retrieval; Cross-modality; Representation learning; Person search;
DOI
10.1109/ICME55011.2023.00476
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Text-based person search is an important sub-task of cross-modality image retrieval that aims to retrieve images of a person of interest from a textual description. The large information gap between the image and text modalities makes this task challenging. Recent methods adopt locally aligned feature learning strategies but do not sufficiently mine local information. Accordingly, we explore a Multi-level Part-aware Feature Disentangling (MPFD) framework to extract visual and textual representations more fully from multiple angles. Specifically, we introduce a Textual Part-aware Matching (TPM) module into the existing baseline to disentangle local features for detailed information from both visual and textual part-aware aspects. In addition, to fuse multiple local features and improve the discrimination of global features, we propose a Multi-level Feature Integration (MFI) module that is capable of perceiving the relations between features. We conduct experiments on the CUHK-PEDES and ICFG-PEDES datasets to verify the proposed framework, and the results demonstrate that the MPFD framework performs favorably against state-of-the-art methods.
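As a rough illustration of the retrieval setting the abstract describes, the sketch below ranks gallery images against a text query by combining a global similarity with part-level similarities. It is not the MPFD, TPM, or MFI implementation: the feature layout (a "global" vector plus a "parts" list), the names `match_score` and `rank_gallery`, the `part_weight` parameter, and the toy vectors are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def match_score(text, image, part_weight=0.5):
    """Global similarity plus a weighted mean of part-level similarities.

    Assumption: `text` and `image` are dicts holding a "global" vector and a
    "parts" list with the same number of part vectors on both sides.
    """
    global_sim = cosine(text["global"], image["global"])
    part_sims = [cosine(t, i) for t, i in zip(text["parts"], image["parts"])]
    part_sim = sum(part_sims) / len(part_sims) if part_sims else 0.0
    return global_sim + part_weight * part_sim

def rank_gallery(query, gallery):
    """Return gallery ids ordered from best to worst match for the query."""
    scored = [(match_score(query, feats), image_id)
              for image_id, feats in gallery.items()]
    return [image_id for _, image_id in sorted(scored, reverse=True)]

# Toy 2-D features; real features would come from trained encoders.
query = {"global": [1.0, 0.0], "parts": [[1.0, 0.0], [0.0, 1.0]]}
gallery = {
    "img_a": {"global": [0.9, 0.1], "parts": [[1.0, 0.1], [0.1, 1.0]]},
    "img_b": {"global": [0.0, 1.0], "parts": [[0.0, 1.0], [1.0, 0.0]]},
}
ranking = rank_gallery(query, gallery)
```

In this toy setup `img_a` agrees with the query at both the global and the part level, so it ranks ahead of `img_b`. How local and global cues are actually disentangled and fused in the paper is not specified in the abstract.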
Pages: 2801-2806
Page count: 6