DSSL: Deep Surroundings-person Separation Learning for Text-based Person Retrieval

Cited by: 110

Authors:

Zhu, Aichun [1]

Wang, Zijie [1]

Li, Yifeng [1]

Wan, Xili [1]

Jin, Jing [1]

Wang, Tian [2]

Hu, Fangqiang [1]

Hua, Gang [3]

Affiliations:

[1] Nanjing Tech Univ, Nanjing, Peoples R China

[2] Beihang Univ, Beijing, Peoples R China

[3] China Univ Min & Technol, Xuzhou, Jiangsu, Peoples R China

Source:

PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021 | 2021

Funding:

National Natural Science Foundation of China; China Postdoctoral Science Foundation;

Keywords:

person retrieval; text-based person re-identification; cross-modal retrieval; surroundings-person separation;

DOI:

10.1145/3474085.3475369

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Many previous methods for text-based person retrieval are devoted to learning a latent common space mapping, with the purpose of extracting modality-invariant features from both the visual and textual modalities. Nevertheless, due to the complexity of high-dimensional data, unconstrained mapping paradigms are not able to properly capture discriminative clues about the corresponding person while dropping misaligned information. Intuitively, the information contained in visual data can be divided into person information (PI) and surroundings information (SI), which are mutually exclusive. To this end, we propose a novel Deep Surroundings-person Separation Learning (DSSL) model to effectively extract and match person information, and hence achieve superior retrieval accuracy. A surroundings-person separation and fusion mechanism plays the key role in realizing an accurate and effective surroundings-person separation under a mutual exclusion constraint. In order to adequately utilize multimodal and multi-granular information for higher retrieval accuracy, five diverse alignment paradigms are adopted. Extensive experiments are carried out to evaluate the proposed DSSL on CUHK-PEDES, which is currently the only accessible dataset for the text-based person retrieval task. DSSL achieves state-of-the-art performance on CUHK-PEDES. To properly evaluate the proposed DSSL in real scenarios, a Real Scenarios Text-based Person Reidentification (RSTPReid) dataset is constructed to benefit future research on text-based person retrieval, which will be publicly available.


Pages: 209-217

Page count: 9

50 items in total

[1] SUM: Serialized Updating and Matching for text-based person retrieval
Wang, Zijie
Zhu, Aichun
Xue, Jingyi
Jiang, Daihong
Liu, Chao
Li, Yifeng
Hu, Fangqiang
KNOWLEDGE-BASED SYSTEMS, 2022, 248
[2] Look Before You Leap: Improving Text-based Person Retrieval by Learning A Consistent Cross-modal Common Manifold
Wang, Zijie
Zhu, Aichun
Xue, Jingyi
Wan, Xili
Liu, Chao
Wang, Tian
Li, Yifeng
PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 1984 - 1992
[3] Fine-grained Semantics-aware Representation Learning for Text-based Person Retrieval
Wang, Di
Yan, Feng
Wang, Yifeng
Zhao, Lin
Liang, Xiao
Zhong, Haodi
Zhang, Ronghua
PROCEEDINGS OF THE 4TH ANNUAL ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2024, 2024, : 92 - 100
[4] Pedestrian-specific Bipartite-aware Similarity Learning for Text-based Person Retrieval
Shen, Fei
Shu, Xiangbo
Du, Xiaoyu
Tang, Jinhui
PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 8922 - 8931
[5] EESSO: Exploiting Extreme and Smooth Signals via Omni-frequency learning for Text-based Person Retrieval
Xue, Jingyi
Wang, Zijie
Dong, Guan-Nan
Zhu, Aichun
IMAGE AND VISION COMPUTING, 2024, 142
[6] Enhancing visual representation for text-based person searching
Shen, Wei
Fang, Ming
Wang, Yuxia
Xiao, Jiafeng
Li, Diping
Chen, Huangqun
Xu, Ling
Zhang, Weifeng
KNOWLEDGE-BASED SYSTEMS, 2025, 309
[7] Improving Text-Based Person Retrieval by Excavating All-Round Information Beyond Color
Zhu, Aichun
Wang, Zijie
Xue, Jingyi
Wan, Xili
Jin, Jing
Wang, Tian
Snoussi, Hichem
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, : 1 - 15
[8] Cross-Modal Uncertainty Modeling With Diffusion-Based Refinement for Text-Based Person Retrieval
Li, Shenshen
Xu, Xing
He, Chen
Shen, Fumin
Yang, Yang
Shen, Heng Tao
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2025, 35 (03) : 2881 - 2893
[9] Local-enhanced representation for text-based person search
Zhang, Guoqing
Chen, Yuhao
Zheng, Yuhui
Martin, Gaven
Wang, Ruili
PATTERN RECOGNITION, 2025, 161
[10] Text-Guided Visual Feature Refinement for Text-Based Person Search
Gao, Liying
Niu, Kai
Ma, Zehong
Jiao, Bingliang
Tan, Tonghao
Wang, Peng
PROCEEDINGS OF THE 2021 INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL (ICMR '21), 2021, : 118 - 126
