Improving Text-Based Person Retrieval by Excavating All-Round Information Beyond Color

Cited by: 2
Authors
Zhu, Aichun [1 ]
Wang, Zijie [1 ]
Xue, Jingyi [1 ]
Wan, Xili [1 ]
Jin, Jing [1 ]
Wang, Tian [2 ]
Snoussi, Hichem [3 ]
Affiliations
[1] Nanjing Tech Univ, Coll Comp & Informat Engn, Nanjing 211816, Peoples R China
[2] Beihang Univ, Inst Artificial Intelligence, Zhongguancun Lab, SKLCCSE, Beijing 100191, Peoples R China
[3] Univ Technol Troyes, Inst Charles Delaunay, LM2S FRE CNRS 2019, F-10004 Troyes, France
Funding
National Natural Science Foundation of China
Keywords
Task analysis; Image color analysis; Visualization; Semantics; Data models; Pedestrians; Learning systems; Color (CLR) information; cross-modal retrieval; frequency; person reidentification (ReID); text-based person retrieval; NETWORK;
DOI
10.1109/TNNLS.2024.3368217
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Text-based person retrieval is the process of searching a massive visual resource library for images of a particular pedestrian based on a textual query. Existing approaches often suffer from over-reliance on color (CLR) information, which can result in suboptimal retrieval performance by distracting the model from other important visual cues such as texture and structure. To address this problem, we propose a novel framework to Excavate All-round Information Beyond Color for the task of text-based person retrieval, which is therefore termed EAIBC. The EAIBC architecture includes four branches, namely an RGB branch, a grayscale (GRS) branch, a high-frequency (HFQ) branch, and a CLR branch. Furthermore, we introduce a mutual learning (ML) mechanism to facilitate communication and learning among the branches, enabling them to take full advantage of all-round information in an effective and balanced manner. We evaluate the proposed method on three benchmark datasets, namely CUHK-PEDES, ICFG-PEDES, and RSTPReid. The experimental results demonstrate that EAIBC significantly outperforms existing methods and achieves state-of-the-art (SOTA) performance in supervised, weakly supervised, and cross-domain settings.
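The abstract describes a four-branch architecture (RGB, GRS, HFQ, and CLR) coupled with a mutual learning (ML) mechanism, but the record contains no implementation details. The following is only a minimal sketch of how such a multi-branch retrieval model with a mutual-learning objective could look; the class and function names (BranchEncoder, FourBranchModel, mutual_learning_loss), the use of PyTorch, and the KL-divergence form of the mutual-learning term are all illustrative assumptions, not the authors' code.

    # Minimal illustrative sketch (PyTorch assumed), NOT the authors' implementation:
    # four visual branches plus a placeholder text projection, with a mutual
    # learning term that aligns the branches' image-text similarity distributions.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class BranchEncoder(nn.Module):
        """Tiny stand-in for one visual branch (RGB / GRS / HFQ / CLR)."""
        def __init__(self, in_channels, embed_dim=256):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(in_channels, 32, kernel_size=3, stride=2, padding=1),
                nn.ReLU(),
                nn.AdaptiveAvgPool2d(1),
                nn.Flatten(),
                nn.Linear(32, embed_dim),
            )

        def forward(self, x):
            return F.normalize(self.net(x), dim=-1)

    class FourBranchModel(nn.Module):
        """Hypothetical four-branch visual encoder plus a text projection."""
        def __init__(self, text_dim=768, embed_dim=256):
            super().__init__()
            self.rgb = BranchEncoder(3, embed_dim)   # original RGB image
            self.grs = BranchEncoder(1, embed_dim)   # grayscale image
            self.hfq = BranchEncoder(1, embed_dim)   # high-frequency map
            self.clr = BranchEncoder(3, embed_dim)   # color-focused input
            self.txt_proj = nn.Linear(text_dim, embed_dim)

        def forward(self, rgb, grs, hfq, clr, text_feat):
            branches = {
                "rgb": self.rgb(rgb), "grs": self.grs(grs),
                "hfq": self.hfq(hfq), "clr": self.clr(clr),
            }
            text = F.normalize(self.txt_proj(text_feat), dim=-1)
            return branches, text

    def mutual_learning_loss(branches, text, temperature=0.07):
        """KL terms between branch-wise image-text similarity distributions,
        so each branch learns from the others' predictions."""
        logits = {k: v @ text.t() / temperature for k, v in branches.items()}
        keys = list(logits)
        loss = 0.0
        for a in keys:
            for b in keys:
                if a == b:
                    continue
                p_a = F.log_softmax(logits[a], dim=-1)
                p_b = F.softmax(logits[b], dim=-1).detach()  # teacher side detached
                loss = loss + F.kl_div(p_a, p_b, reduction="batchmean")
        return loss / (len(keys) * (len(keys) - 1))

    # Toy usage with random tensors (batch of 4), grayscale and high-frequency
    # inputs derived from RGB by crude proxies for illustration only.
    if __name__ == "__main__":
        model = FourBranchModel()
        rgb = torch.randn(4, 3, 128, 64)
        grs = rgb.mean(dim=1, keepdim=True)
        hfq = grs - F.avg_pool2d(grs, 3, stride=1, padding=1)
        text_feat = torch.randn(4, 768)
        branches, text = model(rgb, grs, hfq, rgb, text_feat)
        print(mutual_learning_loss(branches, text).item())

In this sketch the mutual-learning term only encourages agreement among branches; a full training objective would additionally need cross-modal matching losses, which are not specified in the abstract and are therefore omitted here.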
Pages: 1-15 (15 pages)