Person Search by Text Attribute Query as Zero-Shot Learning

Cited by: 33
Authors
Dong, Qi [1]
Gong, Shaogang [1]
Zhu, Xiatian [2]
Affiliations
[1] Queen Mary Univ London, London, England
[2] Vis Semant Ltd, London, England
Source
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019) | 2019
Funding
Innovate UK project;
DOI
10.1109/ICCV.2019.00375
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Existing person search methods predominantly assume the availability of at least one-shot imagery sample of the queried person. This assumption is limited in circumstances where only a brief textual (or verbal) description of the target person is available. In this work, we present a deep learning method for text attribute description based person search without any query imagery. Whilst conventional cross-modality matching methods, such as global visual-textual embedding based zero-shot learning and local individual attribute recognition, are functionally applicable, they are limited by several assumptions invalid to person search in deployment scale, data quality, and/or category name semantics. We overcome these issues by formulating an Attribute-Image Hierarchical Matching (AIHM) model. It is able to more reliably match text attribute descriptions with noisy surveillance person images by jointly learning global category-level and local attribute-level textual-visual embedding as well as matching. Extensive evaluations demonstrate the superiority of our AIHM model over a wide variety of state-of-the-art methods on three publicly available attribute labelled surveillance person search benchmarks: Market-1501, DukeMTMC, and PA100K.
Pages: 3651-3660
Page count: 10