MobRecon: Mobile-Friendly Hand Mesh Reconstruction from Monocular Image

Cited by: 31
Authors
Chen, Xingyu [1]
Liu, Yufeng [3]
Dong, Yajiao [1]
Zhang, Xiong [2]
Ma, Chongyang [1]
Xiong, Yanmin [1]
Zhang, Yuan [1]
Guo, Xiaoyan [1]
Affiliations
[1] Kuaishou Technol, Y Tech, Beijing, Peoples R China
[2] Baidu Inc, YY Live, Beijing, Peoples R China
[3] Southeast Univ, SEU ALLEN Joint Ctr, Inst Brain & Intelligence, Nanjing, Peoples R China
Source
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) | 2022
DOI
10.1109/CVPR52688.2022.01989
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this work, we propose a framework for single-view hand mesh reconstruction that simultaneously achieves high reconstruction accuracy, fast inference speed, and temporal coherence. Specifically, for 2D encoding, we propose lightweight yet effective stacked structures. For 3D decoding, we provide an efficient graph operator, namely depth-separable spiral convolution. Moreover, we present a novel feature lifting module for bridging the gap between 2D and 3D representations. This module begins with a map-based position regression (MapReg) block, which integrates the merits of both the heatmap encoding and position regression paradigms for improved 2D accuracy and temporal coherence. MapReg is followed by pose pooling and pose-to-vertex lifting, which transform 2D pose encodings into semantic features of 3D vertices. Overall, our hand reconstruction framework, called MobRecon, has an affordable computational cost and a miniature model size, and reaches a high inference speed of 83 FPS on an Apple A14 CPU. Extensive experiments on popular datasets such as FreiHAND, RHD, and HO3Dv2 demonstrate that MobRecon achieves superior reconstruction accuracy and temporal coherence.
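The abstract names depth-separable spiral convolution but gives no details. The following is only a rough sketch of how such an operator could be factorized, assuming it follows the usual depth-wise plus point-wise pattern applied to a fixed-length spiral of neighbouring vertices. All function names, argument layouts, and the toy sizes are illustrative assumptions, not the authors' implementation.

```python
def spiral_conv_params(c_in, c_out, spiral_len):
    """Weight count of a plain spiral convolution (assumed form): one dense
    layer over the concatenated features of `spiral_len` neighbours.
    Bias terms are ignored."""
    return spiral_len * c_in * c_out


def ds_spiral_conv_params(c_in, c_out, spiral_len):
    """Weight count of the assumed depth-separable variant: one weight per
    (spiral position, channel), then a c_in x c_out point-wise layer."""
    return spiral_len * c_in + c_in * c_out


def ds_spiral_conv(features, spirals, depth_w, point_w):
    """Hypothetical forward pass.

    features: V vertices, each a list of c_in floats
    spirals:  V lists of S neighbour indices (precomputed spiral order)
    depth_w:  S x c_in depth-wise weights
    point_w:  c_in x c_out point-wise weights
    """
    c_in, c_out = len(point_w), len(point_w[0])
    out = []
    for spiral in spirals:
        # Depth-wise step: aggregate each channel independently over the spiral.
        agg = [
            sum(depth_w[s][c] * features[v][c] for s, v in enumerate(spiral))
            for c in range(c_in)
        ]
        # Point-wise step: mix channels with a 1x1-style linear map.
        out.append(
            [sum(agg[c] * point_w[c][o] for c in range(c_in)) for o in range(c_out)]
        )
    return out


# Toy mesh: 3 vertices, 2 channels, spiral length 2 (all values made up).
toy_features = [[1, 2], [3, 4], [5, 6]]
toy_spirals = [[0, 1], [1, 2], [2, 0]]
toy_depth_w = [[1, 1], [1, 1]]
toy_point_w = [[1], [1]]
toy_out = ds_spiral_conv(toy_features, toy_spirals, toy_depth_w, toy_point_w)
```

Under these assumptions, with 64 input channels, 64 output channels, and a spiral of 9 vertices, the plain form needs 9 × 64 × 64 = 36864 weights while the factorized form needs 9 × 64 + 64 × 64 = 4672, which illustrates why such a factorization suits a mobile-oriented decoder.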
Pages: 20512-20522
Number of pages: 11