Person Search by Text Attribute Query as Zero-Shot Learning

Cited by: 33

Authors:

Dong, Qi [1]

Gong, Shaogang [1]

Zhu, Xiatian [2]

Affiliations:

[1] Queen Mary Univ London, London, England

[2] Vis Semant Ltd, London, England

Source:

2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019) | 2019

Funding:

Innovate UK project

DOI:

10.1109/ICCV.2019.00375

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Existing person search methods predominantly assume the availability of at least one-shot imagery sample of the queried person. This assumption is limited in circumstances where only a brief textual (or verbal) description of the target person is available. In this work, we present a deep learning method for text attribute description based person search without any query imagery. Whilst conventional cross-modality matching methods, such as global visual-textual embedding based zero-shot learning and local individual attribute recognition, are functionally applicable, they are limited by several assumptions invalid to person search in deployment scale, data quality, and/or category name semantics. We overcome these issues by formulating an Attribute-Image Hierarchical Matching (AIHM) model. It is able to more reliably match text attribute descriptions with noisy surveillance person images by jointly learning global category-level and local attribute-level textual-visual embedding as well as matching. Extensive evaluations demonstrate the superiority of our AIHM model over a wide variety of state-of-the-art methods on three publicly available attribute labelled surveillance person search benchmarks: Market-1501, DukeMTMC, and PA100K.

Pages: 3651-3660

Page count: 10
