Learning Gait Representations with Noisy Multi-Task Learning

Cited: 8
Authors
Cosma, Adrian [1]
Radoi, Emilian [1]
Affiliations
[1] Univ Politehn Bucuresti, Fac Automat Control & Comp Sci, Bucharest 006042, Romania
Keywords
gait recognition; self-supervised learning; pose estimation; multi-task learning; weakly-supervised learning; OLDER-ADULTS; RECOGNITION; AGE; PERFORMANCE; PATTERNS; IMAGE;
DOI
10.3390/s22186803
Chinese Library Classification
O65 [Analytical Chemistry];
Subject Classification Codes
070302 ; 081704 ;
Abstract
Gait analysis has proven to be a reliable way to perform person identification without relying on subject cooperation. Walking is a biometric that does not change significantly over short periods of time and can be regarded as unique to each person. So far, the study of gait analysis has focused mostly on identification and demographics estimation, without considering many of the pedestrian attributes that appearance-based methods rely on. In this work, alongside gait-based person identification, we explore pedestrian attribute identification solely from movement patterns. We propose DenseGait, the largest dataset for pretraining gait analysis systems, containing 217K anonymized tracklets annotated automatically with 42 appearance attributes. DenseGait is constructed by automatically processing video streams and offers the full array of gait covariates present in the real world. We make the dataset available to the research community. Additionally, we propose GaitFormer, a transformer-based model that, after pretraining in a multi-task fashion on DenseGait, achieves 92.5% accuracy on CASIA-B and 85.33% on FVG without utilizing any manually annotated data. This corresponds to a +14.2% and +9.67% accuracy increase compared to similar methods. Moreover, GaitFormer is able to accurately identify gender information and a multitude of appearance attributes using only movement patterns. The code to reproduce the experiments is made publicly available.
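The multi-task pretraining described in the abstract can be pictured as a transformer encoder over pose sequences with separate prediction heads for identity and for the automatically annotated appearance attributes (gender being one of them). The sketch below is illustrative only, not the published GaitFormer architecture: the joint count, layer sizes, head layout, and class counts are assumptions made for the example.

```python
import torch
import torch.nn as nn

class MultiTaskGaitTransformer(nn.Module):
    """Illustrative sketch of multi-task gait pretraining.
    A transformer encoder runs over 2D-pose sequences; two heads share the
    pooled embedding. Dimensions and head layout are assumptions, not the
    GaitFormer specification."""

    def __init__(self, num_joints=18, d_model=128, num_ids=1000,
                 num_attributes=42, seq_len=60):
        super().__init__()
        # Each frame is a flattened (x, y) pose vector projected to d_model.
        self.frame_proj = nn.Linear(num_joints * 2, d_model)
        self.pos_embed = nn.Parameter(torch.zeros(1, seq_len, d_model))
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=4)
        # Task-specific heads over the shared sequence embedding.
        self.id_head = nn.Linear(d_model, num_ids)            # tracklet identity
        self.attr_head = nn.Linear(d_model, num_attributes)   # appearance attributes

    def forward(self, poses):
        # poses: (batch, seq_len, num_joints * 2)
        x = self.frame_proj(poses) + self.pos_embed[:, :poses.size(1)]
        x = self.encoder(x)
        embedding = x.mean(dim=1)  # temporal average pooling
        return self.id_head(embedding), self.attr_head(embedding), embedding
```

In such a setup, the identity logits would typically be trained with cross-entropy over tracklet IDs and the attribute logits with binary cross-entropy against the automatically generated (and therefore noisy) appearance labels, with the pooled embedding reused for downstream recognition; this training recipe is likewise an assumption for illustration.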
Pages: 20