Fusion-Attention Network for person search with free-form natural language

Cited by: 18
Authors
Ji, Zhong [1]
Li, Shengjia [1]
Pang, Yanwei [1]
Affiliations
[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Person search; Natural language description; Attention network
DOI
10.1016/j.patrec.2018.10.020
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In the task of searching for persons in surveillance videos or large-scale image datasets, retrieving persons with free-form natural language is more challenging than retrieving them with images or attributes. To deal with the challenges arising from the complexity of free-form natural language and of the visual-description mapping, we propose to strengthen the role of textual descriptions by means of fusion and attention mechanisms, making the discriminative words visually sensitive. Specifically, we develop an end-to-end fusion-attention structure, called the Description-Strengthened Fusion-Attention Network (DSFA-Net), to tackle this task. DSFA-Net has a fusion sub-network and an attention sub-network, in which three attention mechanisms are applied. Extensive experiments on the large-scale CUHK-PEDES dataset demonstrate the superiority of DSFA-Net. (C) 2018 Elsevier B.V. All rights reserved.
Pages: 205-211
Page count: 7
Related papers
9 total
  • [1] Multimodal Alignment and Attention-Based Person Search via Natural Language Description
    Ji, Zhong
    Li, Shengjia
    IEEE INTERNET OF THINGS JOURNAL, 2020, 7 (11) : 11147 - 11156
  • [2] Multilevel Collaborative Attention Network for Person Search
    Li, Wenbo
    Chen, Ze
    Fu, Zhenyong
    Lu, Hongtao
    COMPUTER VISION - ACCV 2018, PT I, 2019, 11361 : 467 - 482
  • [3] Scale Voting With Pyramidal Feature Fusion Network for Person Search
    Hong, Zheran
    Liu, Bin
    Lu, Yan
    Yin, Guojun
    Yu, Nenghai
    IEEE ACCESS, 2019, 7 : 139692 - 139702
  • [4] Cross-scale global attention feature pyramid network for person search
    Li, Yang
    Xu, Huahu
    Bian, Minjie
    Xiao, Junsheng
    IMAGE AND VISION COMPUTING, 2021, 116
  • [5] Adversarial Attribute-Text Embedding for Person Search With Natural Language Query
    Zha, Zheng-Jun
    Liu, Jiawei
    Chen, Di
    Wu, Feng
    IEEE TRANSACTIONS ON MULTIMEDIA, 2020, 22 (07) : 1836 - 1846
  • [6] Multi-Attention-Guided Cascading Network for End-to-End Person Search
    Yang, Jianxi
    Wang, Xiaoyong
    APPLIED SCIENCES-BASEL, 2023, 13 (09)
  • [7] DAAPS: A Deformable-Attention-Based Anchor-Free Person Search Model
    Xin, Xiaoqi
    Han, Dezhi
    Cui, Mingming
    CMC-COMPUTERS MATERIALS & CONTINUA, 2023, 77 (02) : 2407 - 2425
  • [8] PS-ARM: An End-to-End Attention-Aware Relation Mixer Network for Person Search
    Fiaz, Mustansar
    Cholakkal, Hisham
    Narayan, Sanath
    Anwer, Rao Muhammad
    Khan, Fahad Shahbaz
    COMPUTER VISION - ACCV 2022, PT V, 2023, 13845 : 234 - 250
  • [9] GPAN-PS: Global-Response Pedestrian Attention Network for End-to-End Person Search
    Zheng, Linlin
    Han, Dezhi
    Xin, Xiaoqi
    IEEE ACCESS, 2024, 12 : 157686 - 157698