DSSL: Deep Surroundings-person Separation Learning for Text-based Person Retrieval

Cited by: 111
Authors
Zhu, Aichun [1]
Wang, Zijie [1]
Li, Yifeng [1]
Wan, Xili [1]
Jin, Jing [1]
Wang, Tian [2]
Hu, Fangqiang [1]
Hua, Gang [3]
Affiliations
[1] Nanjing Tech Univ, Nanjing, Peoples R China
[2] Beihang Univ, Beijing, Peoples R China
[3] China Univ Min & Technol, Xuzhou, Jiangsu, Peoples R China
Source
PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021 | 2021
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China;
Keywords
person retrieval; text-based person re-identification; cross-modal retrieval; surroundings-person separation;
DOI
10.1145/3474085.3475369
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Many previous methods for text-based person retrieval learn a mapping into a latent common space in order to extract modality-invariant features from the visual and textual modalities. Nevertheless, due to the complexity of high-dimensional data, such unconstrained mappings cannot properly capture discriminative clues about the corresponding person while dropping misaligned information. Intuitively, the information contained in visual data can be divided into person information (PI) and surroundings information (SI), which are mutually exclusive. To this end, we propose a novel Deep Surroundings-person Separation Learning (DSSL) model to effectively extract and match person information and hence achieve superior retrieval accuracy. A surroundings-person separation and fusion mechanism plays the key role in realizing an accurate and effective surroundings-person separation under a mutual exclusion constraint. To adequately utilize multimodal and multi-granular information for higher retrieval accuracy, five diverse alignment paradigms are adopted. Extensive experiments are carried out to evaluate the proposed DSSL on CUHK-PEDES, which is currently the only accessible dataset for the text-based person retrieval task. DSSL achieves state-of-the-art performance on CUHK-PEDES. To properly evaluate the proposed DSSL in real scenarios, a Real Scenarios Text-based Person Reidentification (RSTPReid) dataset is constructed to benefit future research on text-based person retrieval, which will be publicly available.
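The abstract does not say how the mutual exclusion constraint or the matching step is computed, so the following is only a minimal illustrative sketch, not the DSSL implementation. It assumes that person and surroundings features are plain vectors, that a squared cosine similarity can stand in for an exclusion penalty, and that retrieval ranks gallery images by cosine similarity to the text feature. All function names and the toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def exclusion_penalty(person_feat, surround_feat):
    """Hypothetical stand-in for a mutual exclusion constraint.

    Squared cosine similarity: 0 when the two features are orthogonal,
    1 when they point in the same (or opposite) direction.
    """
    return cosine(person_feat, surround_feat) ** 2


def rank_gallery(text_feat, gallery_person_feats):
    """Return gallery indices sorted by descending similarity to the text query."""
    scores = [cosine(text_feat, g) for g in gallery_person_feats]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy features in an assumed 3-dimensional common space.
text_query = [1.0, 0.0, 0.0]
gallery = [
    [0.0, 1.0, 0.0],  # unrelated to the query
    [0.9, 0.1, 0.0],  # close to the query
    [0.5, 0.5, 0.0],  # partially related
]
ranking = rank_gallery(text_query, gallery)
print(ranking)
```

In this toy setup the gallery item closest to the query direction is ranked first, and the penalty is smallest when the person and surroundings features share no direction.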
Pages: 209 - 217
Number of pages: 9