Adversarial Attribute-Text Embedding for Person Search With Natural Language Query

Cited by: 43
Authors
Zha, Zheng-Jun [1]
Liu, Jiawei [1]
Chen, Di [1]
Wu, Feng [1]
Affiliations
[1] Univ Sci & Technol China, Sch Informat Sci & Technol, Hefei 230027, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Visualization; Natural languages; Feature extraction; Cameras; Semantics; Task analysis; Robustness; Person search; natural language; adversarial learning; visual attributes; graph convolution network; REIDENTIFICATION;
DOI
10.1109/TMM.2020.2972168
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
The newly emerging task of person search with natural language query aims at retrieving a target pedestrian from a text description of that pedestrian. It is more widely applicable than person search with image/video query, i.e., person re-identification. In this paper, we propose a novel Adversarial Attribute-Text Embedding (AATE) network for person search with text query. In particular, a cross-modal adversarial learning module is proposed to learn discriminative and modality-invariant visual-textual features. It consists of a cross-modal learner and a modality discriminator, which play a min-max game in an adversarial manner. The former improves intra-modality discrimination and inter-modality invariance in order to confuse the modality discriminator. The latter distinguishes features from different modalities and thereby boosts the learning of modality-invariant features. Moreover, a visual attribute graph convolutional network is proposed to learn visual attributes of pedestrians, which offer better descriptiveness, interpretability and robustness than pedestrian appearance features. A hierarchical text embedding network, consisting of multi-stacked bidirectional LSTMs and a textual attention block, is developed to extract effective textual features from text descriptions of pedestrians. Extensive experimental results on two challenging benchmarks demonstrate the effectiveness of the proposed approach.
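The min-max game between the cross-modal learner and the modality discriminator can be illustrated with a small sketch. This is not the AATE implementation: the linear discriminator, the function names and the label convention (image = 1, text = 0) are all assumptions made purely for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def discriminator_prob(feature, weights, bias):
    """Probability that `feature` is an image feature.

    Hypothetical linear discriminator; the paper's discriminator is not
    specified in the abstract.
    """
    z = sum(w * x for w, x in zip(weights, feature)) + bias
    return sigmoid(z)


def discriminator_loss(image_feats, text_feats, weights, bias):
    """Binary cross-entropy with assumed labels: image = 1, text = 0.

    The discriminator is trained to minimize this value.
    """
    eps = 1e-12  # avoids log(0)
    total = 0.0
    for f in image_feats:
        total -= math.log(discriminator_prob(f, weights, bias) + eps)
    for f in text_feats:
        total -= math.log(1.0 - discriminator_prob(f, weights, bias) + eps)
    return total / (len(image_feats) + len(text_feats))


def learner_adversarial_loss(image_feats, text_feats, weights, bias):
    """Adversarial term for the feature learner.

    The learner minimizes the negated discriminator loss, i.e. it tries to
    make the two modalities indistinguishable.
    """
    return -discriminator_loss(image_feats, text_feats, weights, bias)


# Modality-invariant features: the discriminator can do no better than chance.
shared = [[0.2, -0.1], [0.5, 0.3]]
chance = discriminator_loss(shared, shared, weights=[0.0, 0.0], bias=0.0)

# Clearly separated features: a suitable discriminator gets a lower loss.
images = [[2.0, 0.0], [3.0, 0.0]]
texts = [[-2.0, 0.0], [-3.0, 0.0]]
separated = discriminator_loss(images, texts, weights=[1.0, 0.0], bias=0.0)
```

In this toy setting, `chance` equals log 2, the loss of a discriminator that cannot tell the modalities apart, while `separated` is lower because the modalities are easy to distinguish. The learner's goal in the adversarial game is to push the discriminator toward the first case.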
Pages: 1836 - 1846
Page count: 11
Related papers
11 in total
  • [1] Address the Unseen Relationships: Attribute Correlations in Text Attribute Person Search
    Yang, Xi
    Wang, Xiaoqi
    Wang, Nannan
    Gao, Xinbo
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (11) : 16916 - 16926
  • [2] Multimodal Alignment and Attention-Based Person Search via Natural Language Description
    Ji, Zhong
    Li, Shengjia
    IEEE INTERNET OF THINGS JOURNAL, 2020, 7 (11) : 11147 - 11156
  • [3] Text-based Person Search via Virtual Attribute Learning
    Wang C.-J.
    Su J.-W.
    Luo Z.-M.
    Cao D.-L.
    Lin Y.-J.
    Li S.-Z.
    Ruan Jian Xue Bao/Journal of Software, 2023, 34 (05) : 2035 - 2050
  • [4] AdvExpander: Generating Natural Language Adversarial Examples by Expanding Text
    Shao, Zhihong
    Wu, Zhongqin
    Huang, Minlie
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 1184 - 1196
  • [5] Towards Unified Text-based Person Retrieval: A Large-scale Multi-Attribute and Language Search Benchmark
    Yang, Shuyu
    Zhou, Yinan
    Zheng, Zhedong
    Wang, Yaxiong
    Zhu, Li
    Wu, Yujiao
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 4492 - 4501
  • [6] Deep Adversarial Graph Attention Convolution Network for Text-Based Person Search
    Liu, Jiawei
    Zha, Zheng-Jun
    Hong, Richang
    Wang, Meng
    Zhang, Yongdong
    PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19), 2019, : 665 - 673
  • [7] Improving Cross-Modal Constraints: Text Attribute Person Search With Graph Attention Networks
    Yang, Xi
    Wang, Xiaoqi
    Yang, Dong
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 2493 - 2503
  • [8] An alternative approach to natural language query expansion in search engines: Text analysis of non-topical terms in Web documents
    Fattahi, Rahmatollah
    Wilson, Concepcion S.
    Cole, Fletcher
    INFORMATION PROCESSING & MANAGEMENT, 2008, 44 (04) : 1503 - 1516
  • [9] AMEN: Adversarial Multi-space Embedding Network for Text-Based Person Re-identification
    Wang, Zijie
    Xue, Jingyi
    Zhu, Aichun
    Li, Yifeng
    Zhang, Mingyi
    Zhong, Chongliang
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2021, PT II, 2021, 13020 : 462 - 473
  • [10] Fusion-Attention Network for person search with free-form natural language
    Ji, Zhong
    Li, Shengjia
    Pang, Yanwei
    PATTERN RECOGNITION LETTERS, 2018, 116 : 205 - 211