Out-of-Domain Human Mesh Reconstruction via Dynamic Bilevel Online Adaptation

Citations: 18
Authors
Guan, Shanyan [1]
Xu, Jingwei [1]
He, Michelle Zhang [1]
Wang, Yunbo [1]
Ni, Bingbing [1]
Yang, Xiaokang [1]
Affiliations
[1] Shanghai Jiao Tong Univ, AI Inst, MoE Key Lab Artificial Intelligence, Shanghai 200240, Peoples R China
Keywords
Training; Adaptation models; Three-dimensional displays; Videos; Optimization; Solid modeling; Data models; 3D exemplar guidance; bilevel online adaptation; dynamic update; out-of-domain human mesh reconstruction; human shape; pose
DOI
10.1109/TPAMI.2022.3194167
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We consider a new problem of adapting a human mesh reconstruction model to out-of-domain streaming videos, where the performance of existing SMPL-based models is significantly affected by the distribution shift represented by different camera parameters, bone lengths, backgrounds, and occlusions. We tackle this problem through online adaptation, gradually correcting the model bias during testing. There are two main challenges: First, the lack of 3D annotations increases the training difficulty and results in 3D ambiguities. Second, non-stationary data distribution makes it difficult to strike a balance between fitting regular frames and hard samples with severe occlusions or dramatic changes. To this end, we propose the Dynamic Bilevel Online Adaptation algorithm (DynaBOA). It first introduces the temporal constraints to compensate for the unavailable 3D annotations and leverages a bilevel optimization procedure to address the conflicts between multi-objectives. DynaBOA provides additional 3D guidance by co-training with similar source examples retrieved efficiently despite the distribution shift. Furthermore, it can adaptively adjust the number of optimization steps on individual frames to fully fit hard samples and avoid overfitting regular frames. DynaBOA achieves state-of-the-art results on three out-of-domain human mesh reconstruction benchmarks.
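The two ideas the abstract names, a bilevel update and a per-frame adaptive number of optimization steps, can be illustrated on a toy problem. The following is a minimal sketch only, not the authors' implementation: the scalar model `y = w * x`, the loss functions, the first-order upper-level update, the stopping rule, and every name and hyperparameter (`adapt_stream`, `lr`, `tol`, `max_steps`) are assumptions made for illustration. It omits the 3D exemplar retrieval entirely.

```python
# Toy illustration (hypothetical, NOT the DynaBOA code): online adaptation of a
# scalar model y = w * x on a stream of "frames", with a two-level update and a
# dynamic number of steps per frame.

def frame_loss_grad(w, x, y):
    """Frame-wise loss (w*x - y)^2 and its gradient with respect to w."""
    err = w * x - y
    return err * err, 2.0 * err * x


def temporal_loss_grad(w, x, y, x_prev, y_prev):
    """Loss on the frame-to-frame change, and its gradient with respect to w."""
    err = w * (x - x_prev) - (y - y_prev)
    return err * err, 2.0 * err * (x - x_prev)


def adapt_stream(frames, w=0.0, lr=0.05, tol=1e-3, max_steps=20):
    """Adapt w frame by frame; return the final w and the steps used per frame."""
    steps_used = []
    prev = None
    for x, y in frames:
        steps = 0
        while steps < max_steps:
            loss, g_low = frame_loss_grad(w, x, y)
            if loss < tol:  # assumed stopping rule: frame is already well fitted
                break
            # Lower level: provisional weights from the frame-wise loss only.
            w_tmp = w - lr * g_low
            # Upper level: evaluate frame-wise plus temporal loss at the
            # provisional weights, then update the original weights
            # (first-order approximation, an assumption of this sketch).
            _, g_up = frame_loss_grad(w_tmp, x, y)
            if prev is not None:
                _, g_temporal = temporal_loss_grad(w_tmp, x, y, *prev)
                g_up += g_temporal
            w = w - lr * g_up
            steps += 1
        steps_used.append(steps)
        prev = (x, y)
    return w, steps_used


# Synthetic stream generated with a "true" weight of 2.0.
frames = [(x, 2.0 * x) for x in [1.0, 1.5, 0.5, 2.0, 1.2, 0.8]]
w_final, steps_used = adapt_stream(frames)
print(w_final, steps_used)
```

In this sketch the first frames, where the model is far from the data, consume many steps, while later frames that are already well fitted consume few or none. That mirrors, in a very simplified form, the stated goal of fitting hard samples fully without overfitting regular frames.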
Pages: 5070-5086
Page count: 17