Neural Descent for Visual 3D Human Pose and Shape

Cited by: 35
Authors
Zanfir, Andrei [1]
Bazavan, Eduard Gabriel [1]
Zanfir, Mihai [1]
Freeman, William T. [1]
Sukthankar, Rahul [1]
Sminchisescu, Cristian [1]
Affiliations
[1] Google Res, Bangalore, Karnataka, India
Source
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021 | 2021
DOI
10.1109/CVPR46437.2021.01425
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present a deep neural network methodology to reconstruct the 3d pose and shape of people, including hand gestures and facial expression, given an input RGB image. We rely on a recently introduced, expressive full-body statistical 3d human model, GHUM, trained end-to-end, and learn to reconstruct its pose and shape state in a self-supervised regime. Central to our methodology is a learning-to-learn-and-optimize approach, referred to as HUman Neural Descent (HUND), which avoids both second-order differentiation when training the model parameters and expensive state gradient descent in order to accurately minimize a semantic differentiable rendering loss at test time. Instead, we rely on novel recurrent stages to update the pose and shape parameters such that not only are losses minimized effectively, but the process is also meta-regularized in order to ensure end-progress. HUND's symmetry between training and testing makes it the first 3d human sensing architecture to natively support different operating regimes, including self-supervised ones. In diverse tests, we show that HUND achieves very competitive results on datasets like H3.6M and 3DPW, as well as good-quality 3d reconstructions for complex imagery collected in-the-wild.
Pages: 14479-14488
Page count: 10