DSSL: Deep Surroundings-person Separation Learning for Text-based Person Retrieval

Cited by: 111
Authors
Zhu, Aichun [1]
Wang, Zijie [1]
Li, Yifeng [1]
Wan, Xili [1]
Jin, Jing [1]
Wang, Tian [2]
Hu, Fangqiang [1]
Hua, Gang [3]
Affiliations
[1] Nanjing Tech Univ, Nanjing, Peoples R China
[2] Beihang Univ, Beijing, Peoples R China
[3] China Univ Min & Technol, Xuzhou, Jiangsu, Peoples R China
Source
PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021 | 2021
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China;
Keywords
person retrieval; text-based person re-identification; cross-modal retrieval; surroundings-person separation;
DOI
10.1145/3474085.3475369
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Many previous methods for text-based person retrieval are devoted to learning a latent common space mapping, with the purpose of extracting modality-invariant features from both the visual and textual modalities. Nevertheless, due to the complexity of high-dimensional data, such unconstrained mapping paradigms cannot properly capture discriminative clues about the corresponding person while dropping the misaligned information. Intuitively, the information contained in visual data can be divided into person information (PI) and surroundings information (SI), which are mutually exclusive. To this end, we propose a novel Deep Surroundings-person Separation Learning (DSSL) model to effectively extract and match person information, and hence achieve superior retrieval accuracy. A surroundings-person separation and fusion mechanism plays the key role in realizing an accurate and effective surroundings-person separation under a mutual exclusion constraint. To adequately utilize multimodal and multi-granular information for higher retrieval accuracy, five diverse alignment paradigms are adopted. Extensive experiments are carried out to evaluate the proposed DSSL on CUHK-PEDES, which is currently the only accessible dataset for the text-based person retrieval task. DSSL achieves state-of-the-art performance on CUHK-PEDES. To properly evaluate the proposed DSSL in real scenarios, a Real Scenarios Text-based Person Reidentification (RSTPReid) dataset is constructed to benefit future research on text-based person retrieval, which will be publicly available.
Pages: 209-217
Page count: 9