Enhanced Spatial-Temporal Salience for Cross-View Gait Recognition

Cited by: 24
Authors
Huang, Tianhuan [1]
Ben, Xianye [1]
Gong, Chen [2]
Zhang, Baochang [3]
Yan, Rui [4]
Wu, Qiang [5]
Affiliations
[1] Shandong Univ, Sch Informat Sci & Engn, Qingdao 266237, Peoples R China
[2] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Key Lab Intelligent Percept & Syst High Dimens In, Minist Educ, Nanjing 210094, Peoples R China
[3] Beihang Univ, Inst Artificial Intelligence, Beijing 100191, Peoples R China
[4] Microsoft Corp, Bellevue, WA 98004 USA
[5] Univ Technol Sydney, Sch Elect & Data Engn, Sydney, NSW 2007, Australia
Keywords
Gait recognition; cross view; spatial-temporal enhance; multi-scale salient feature extraction; UNIFIED FRAMEWORK;
DOI
10.1109/TCSVT.2022.3175959
CLC classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline classification codes
0808; 0809
Abstract
Gait recognition can be used for person identification and re-identification, either on its own or in conjunction with other biometrics. Gait has both spatial and temporal attributes, and it has been observed that decoupling the spatial and temporal features better exploits gait at a fine-grained level. However, the decoupling process also discards the spatial-temporal correlations of the gait video signal. Direct 3D convolution approaches retain such correlations, but they also introduce unnecessary interference. Instead of a common 3D convolution solution, this paper proposes integrating the decoupling process into a 3D convolution framework for cross-view gait recognition. In particular, a novel block consisting of a Parallel-insight Convolution layer integrated with a Spatial-Temporal Dual-Attention (STDA) unit is proposed as the basic block for global spatial-temporal information extraction. Under the guidance of the STDA unit, this block integrates the spatial-temporal information extracted by the two decoupled models while retaining the spatial-temporal correlations. In addition, a Multi-Scale Salient Feature Extractor is proposed to further exploit fine-grained features by extending part-based features with context awareness and adaptively aggregating the spatial features. Extensive experiments on three popular gait datasets, CASIA-B, OULP and OUMVLP, demonstrate that the proposed method outperforms state-of-the-art methods.
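The abstract only summarizes the STDA idea, so the following is a minimal NumPy sketch of the general concept of dual spatial-temporal attention, not the authors' actual STDA unit: it re-weights a feature volume with one salience map over space and one over time, so both cues modulate the same tensor and their correlations are preserved. The function name, the mean-pooling choices, and the softmax normalization are all assumptions made here for illustration.

```python
import numpy as np

def softmax(x, axis):
    # numerically stable softmax along one axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_temporal_dual_attention(feat):
    """Re-weight a (C, T, H, W) gait feature volume with two attention maps.

    Spatial branch: average over channels and frames, softmax over H*W.
    Temporal branch: average over channels and space, softmax over T.
    Both maps rescale the same input tensor, so spatial and temporal
    salience are applied jointly rather than on decoupled copies.
    (Illustrative sketch only; not the paper's STDA design.)
    """
    C, T, H, W = feat.shape
    # spatial salience map: one weight per pixel location
    s = softmax(feat.mean(axis=(0, 1)).reshape(-1), axis=0).reshape(H, W)
    # temporal salience: one weight per frame
    t = softmax(feat.mean(axis=(0, 2, 3)), axis=0)
    # broadcast both maps over the remaining axes
    return feat * s[None, None, :, :] * t[None, :, None, None]

feats = np.random.rand(4, 6, 8, 8)  # toy clip: 4 channels, 6 frames, 8x8 maps
out = spatial_temporal_dual_attention(feats)
print(out.shape)  # (4, 6, 8, 8)
```

In a real network these weights would be learned (e.g. from small convolutional branches) rather than derived from raw activation means, but the broadcasting pattern is the same.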
Pages: 6967-6980
Page count: 14
Cited references
60 records in total
[1]   Covariate Conscious Approach for Gait Recognition Based Upon Zernike Moment Invariants [J].
Aggarwal, Himanshu ;
Vishwakarma, Dinesh Kumar .
IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2018, 10 (02) :397-407
[2]  
Ariyanto G., 2011, INT JOINT C BIOM, P1
[3]   Coupled Bilinear Discriminant Projection for Cross-View Gait Recognition [J].
Ben, Xianye ;
Gong, Chen ;
Zhang, Peng ;
Yan, Rui ;
Wu, Qiang ;
Meng, Weixiao .
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2020, 30 (03) :734-747
[4]   A general tensor representation framework for cross-view gait recognition [J].
Ben, Xianye ;
Zhang, Peng ;
Lai, Zhihui ;
Yan, Rui ;
Zhai, Xinliang ;
Meng, Weixiao .
PATTERN RECOGNITION, 2019, 90 :87-98
[5]   Coupled Patch Alignment for Matching Cross-View Gaits [J].
Ben, Xianye ;
Gong, Chen ;
Zhang, Peng ;
Jia, Xitong ;
Wu, Qiang ;
Meng, Weixiao .
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2019, 28 (06) :3142-3157
[6]   Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset [J].
Carreira, Joao ;
Zisserman, Andrew .
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, :4724-4733
[7]   GaitSet: Cross-View Gait Recognition Through Utilizing Gait As a Deep Set [J].
Chao, Hanqing ;
Wang, Kun ;
He, Yiwei ;
Zhang, Junping ;
Feng, Jianfeng .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (07) :3467-3478
[8]   Multi-View Gait Image Generation for Cross-View Gait Recognition [J].
Chen, Xin ;
Luo, Xizhao ;
Weng, Jian ;
Luo, Weiqi ;
Li, Huiting ;
Tian, Qi .
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 :3041-3055
[9]   Improving the Harmony of the Composite Image by Spatial-Separated Attention Module [J].
Cun, Xiaodong ;
Pun, Chi-Man .
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 :4759-4771
[10]   View-Invariant Deep Architecture for Human Action Recognition Using Two-Stream Motion and Shape Temporal Dynamics [J].
Dhiman, Chhavi ;
Vishwakarma, Dinesh Kumar .
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 :3835-3844