Improving Text-Based Person Retrieval by Excavating All-Round Information Beyond Color

Cited by: 2
Authors
Zhu, Aichun [1 ]
Wang, Zijie [1 ]
Xue, Jingyi [1 ]
Wan, Xili [1 ]
Jin, Jing [1 ]
Wang, Tian [2 ]
Snoussi, Hichem [3 ]
Affiliations
[1] Nanjing Tech Univ, Coll Comp & Informat Engn, Nanjing 211816, Peoples R China
[2] Beihang Univ, Inst Artificial Intelligence, Zhongguancun Lab, SKLCCSE, Beijing 100191, Peoples R China
[3] Univ Technol Troyes, Inst Charles Delaunay, LM2S FRE CNRS 2019, F-10004 Troyes, France
Funding
National Natural Science Foundation of China
Keywords
Task analysis; Image color analysis; Visualization; Semantics; Data models; Pedestrians; Learning systems; Color (CLR) information; cross-modal retrieval; frequency; person reidentification (ReID); text-based person retrieval; NETWORK;
DOI
10.1109/TNNLS.2024.3368217
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Text-based person retrieval is the process of searching a massive visual resource library for images of a particular pedestrian based on a textual query. Existing approaches often suffer from over-reliance on color (CLR) information, which can result in suboptimal retrieval performance by distracting the model from other important visual cues such as texture and structure. To address this problem, we propose a novel framework to Excavate All-round Information Beyond Color for the task of text-based person retrieval, which is therefore termed EAIBC. The EAIBC architecture includes four branches, namely an RGB branch, a grayscale (GRS) branch, a high-frequency (HFQ) branch, and a CLR branch. Furthermore, we introduce a mutual learning (ML) mechanism to facilitate communication and learning among the branches, enabling them to take full advantage of all-round information in an effective and balanced manner. We evaluate the proposed method on three benchmark datasets, namely CUHK-PEDES, ICFG-PEDES, and RSTPReid. The experimental results demonstrate that EAIBC significantly outperforms existing methods and achieves state-of-the-art (SOTA) performance in supervised, weakly supervised, and cross-domain settings.
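The abstract describes a four-branch architecture (RGB, GRS, HFQ, and CLR) coupled with a mutual learning (ML) mechanism, but the record contains no implementation details. The following is only a minimal sketch of how such a multi-branch retrieval model with a mutual-learning objective could look; the class and function names (BranchEncoder, FourBranchModel, mutual_learning_loss), the use of PyTorch, and the KL-divergence form of the mutual-learning term are all illustrative assumptions, not the authors' code.

    # Minimal illustrative sketch (PyTorch assumed), NOT the authors' implementation:
    # four visual branches plus a placeholder text projection, with a mutual
    # learning term that aligns the branches' image-text similarity distributions.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class BranchEncoder(nn.Module):
        """Tiny stand-in for one visual branch (RGB / GRS / HFQ / CLR)."""
        def __init__(self, in_channels, embed_dim=256):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(in_channels, 32, kernel_size=3, stride=2, padding=1),
                nn.ReLU(),
                nn.AdaptiveAvgPool2d(1),
                nn.Flatten(),
                nn.Linear(32, embed_dim),
            )

        def forward(self, x):
            return F.normalize(self.net(x), dim=-1)

    class FourBranchModel(nn.Module):
        """Hypothetical four-branch visual encoder plus a text projection."""
        def __init__(self, text_dim=768, embed_dim=256):
            super().__init__()
            self.rgb = BranchEncoder(3, embed_dim)   # original RGB image
            self.grs = BranchEncoder(1, embed_dim)   # grayscale image
            self.hfq = BranchEncoder(1, embed_dim)   # high-frequency map
            self.clr = BranchEncoder(3, embed_dim)   # color-focused input
            self.txt_proj = nn.Linear(text_dim, embed_dim)

        def forward(self, rgb, grs, hfq, clr, text_feat):
            branches = {
                "rgb": self.rgb(rgb), "grs": self.grs(grs),
                "hfq": self.hfq(hfq), "clr": self.clr(clr),
            }
            text = F.normalize(self.txt_proj(text_feat), dim=-1)
            return branches, text

    def mutual_learning_loss(branches, text, temperature=0.07):
        """KL terms between branch-wise image-text similarity distributions,
        so each branch learns from the others' predictions."""
        logits = {k: v @ text.t() / temperature for k, v in branches.items()}
        keys = list(logits)
        loss = 0.0
        for a in keys:
            for b in keys:
                if a == b:
                    continue
                p_a = F.log_softmax(logits[a], dim=-1)
                p_b = F.softmax(logits[b], dim=-1).detach()  # teacher side detached
                loss = loss + F.kl_div(p_a, p_b, reduction="batchmean")
        return loss / (len(keys) * (len(keys) - 1))

    # Toy usage with random tensors (batch of 4), grayscale and high-frequency
    # inputs derived from RGB by crude proxies for illustration only.
    if __name__ == "__main__":
        model = FourBranchModel()
        rgb = torch.randn(4, 3, 128, 64)
        grs = rgb.mean(dim=1, keepdim=True)
        hfq = grs - F.avg_pool2d(grs, 3, stride=1, padding=1)
        text_feat = torch.randn(4, 768)
        branches, text = model(rgb, grs, hfq, rgb, text_feat)
        print(mutual_learning_loss(branches, text).item())

In this sketch the mutual-learning term only encourages agreement among branches; a full training objective would additionally need cross-modal matching losses, which are not specified in the abstract and are therefore omitted here.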
Pages: 1-15 (15 pages)