Synthetic Humans for Action Recognition from Unseen Viewpoints

Cited by: 61
Authors
Varol, Gul [1]
Laptev, Ivan [2]
Schmid, Cordelia [2]
Zisserman, Andrew [3]
Affiliations
[1] Univ Gustave Eiffel, CNRS, Ecole Ponts, LIGM, Champs Sur Marne, France
[2] INRIA, Paris, France
[3] Univ Oxford, Visual Geometry Grp, Oxford, England
Funding
UK Engineering and Physical Sciences Research Council (EPSRC)
Keywords
Synthetic humans; Action recognition; Representations
DOI
10.1007/s11263-021-01467-7
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Although synthetic training data has been shown to be beneficial for tasks such as human pose estimation, its use for RGB human action recognition is relatively unexplored. Our goal in this work is to answer the question of whether synthetic humans can improve the performance of human action recognition, with a particular focus on generalization to unseen viewpoints. We make use of recent advances in monocular 3D human body reconstruction from real action sequences to automatically render synthetic training videos for the action labels. We make the following contributions: (1) we investigate the extent of variations and augmentations that are beneficial to improving performance at new viewpoints, considering changes in body shape and clothing for individuals, as well as more action-relevant augmentations such as non-uniform frame sampling and interpolating between the motion of individuals performing the same action; (2) we introduce a new data generation methodology, SURREACT, that allows training of spatio-temporal CNNs for action classification; (3) we substantially improve the state-of-the-art action recognition performance on the NTU RGB+D and UESTC standard human action multi-view benchmarks; and finally, (4) we extend the augmentation approach to in-the-wild videos from a subset of the Kinetics dataset to investigate the case when only one-shot training data is available, and demonstrate improvements in this case as well.
Pages: 2264-2287
Number of pages: 24