Exploring attribute localization and correlation for pedestrian attribute recognition

Cited by: 14
Authors
Weng, Dunfang [1]
Tan, Zichang [2, 3]
Fang, Liwei [1]
Guo, Guodong [2, 3]
Affiliations
[1] AI PRIME, Dept Algorithm, Shanghai, Peoples R China
[2] Inst Deep Learning, Baidu Res, Beijing, Peoples R China
[3] Natl Engn Lab Deep Learning Technol & Applicat, Beijing, Peoples R China
Keywords
Pedestrian attribute recognition; Transformer; Attention; Deep learning; PERSON REIDENTIFICATION; FEATURES; NETWORK
DOI
10.1016/j.neucom.2023.02.019
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Pedestrian Attribute Recognition (PAR) is an emerging research topic in the field of video surveillance. PAR usually requires analyzing dozens of attributes simultaneously, e.g., age, gender and clothing type. However, different attributes may depend on different image regions, which makes it difficult to extract comprehensive features for all attributes at once. Moreover, some of these attributes are highly correlated, which poses a second challenge for pedestrian attribute recognition. To address these two issues, we propose two novel modules, namely the Attribute Localization Module (ALM) and the Attribute Correlation Module (ACM). The ALM is built on a multi-stream architecture in which each stream processes a specific attribute individually. More specifically, an attention mechanism is employed to discover and enhance the attribute-related features while suppressing less important regions. In the ACM, the Transformer structure is employed to explore the correlations among different attributes. In particular, we place the Transformer blocks after the ALM, treating each attribute-specific feature as an input token. The ALM and ACM focus on different aspects and thus exploit interrelated and complementary information. We combine the proposed modules into a unified network, Exploring Attribute Localization and Correlation (abbreviated as EALC). Our approach is validated on five large-scale pedestrian attribute datasets: the PETA, RAP, PA-100K, Market-1501 and Duke attribute datasets. Experiments demonstrate the effectiveness and advantages of the proposed EALC. (c) 2023 Elsevier B.V. All rights reserved.
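The data flow described in the abstract (per-attribute attention over image regions, followed by attention across the resulting attribute tokens) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the form of the attention, so the learned per-attribute query vectors, the single-head self-attention standing in for the Transformer blocks, the function names, and all shapes and random weights below are assumptions chosen only for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attribute_localization(feat, queries):
    """ALM-like step (assumed form): one attention stream per attribute.

    feat:    (N, C) backbone features flattened over N spatial positions.
    queries: (A, C) one hypothetical query vector per attribute.
    Returns (A, C) attribute-specific tokens and (A, N) attention maps.
    """
    scores = queries @ feat.T / np.sqrt(feat.shape[1])  # (A, N)
    attn = softmax(scores, axis=1)   # each attribute weights the regions
    tokens = attn @ feat             # attention-weighted pooling per attribute
    return tokens, attn


def attribute_correlation(tokens, w_q, w_k, w_v):
    """ACM-like step (assumed form): single-head self-attention over tokens.

    Each attribute token attends to every other attribute token, so
    correlated attributes can exchange information.
    """
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    corr = softmax(q @ k.T / np.sqrt(q.shape[1]), axis=1)  # (A, A)
    return tokens + corr @ v, corr   # residual connection


# Toy example: 48 spatial positions, 16 channels, 5 attributes (all hypothetical).
rng = np.random.default_rng(0)
N, C, A = 48, 16, 5
feat = rng.normal(size=(N, C))
queries = rng.normal(size=(A, C))
w_q, w_k, w_v = (0.1 * rng.normal(size=(C, C)) for _ in range(3))

tokens, attn = attribute_localization(feat, queries)
refined, corr = attribute_correlation(tokens, w_q, w_k, w_v)

# One binary classifier per attribute (multi-label output).
w_cls = 0.1 * rng.normal(size=(A, C))
logits = np.sum(refined * w_cls, axis=1)
probs = 1.0 / (1.0 + np.exp(-logits))
```

In this sketch, `attn` plays the role of the localization maps (one row per attribute) and `corr` is an attribute-by-attribute weight matrix; a real model would learn all weights end to end and would likely use multi-head attention and several stacked blocks.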
Pages: 140-150
Number of pages: 11