Multi-view Shape Generation for a 3D Human-like Body

Cited by: 15
Authors
Yu, Hang [1 ]
Cheang, Chilam [2 ]
Fu, Yanwei [3 ,4 ]
Xue, Xiangyang [2 ]
Affiliations
[1] Fudan Univ, Acad Engn & Technol, Shanghai, Peoples R China
[2] Fudan Univ, Sch Comp Sci, Shanghai, Peoples R China
[3] Fudan Univ, Sch Data Sci, Shanghai, Peoples R China
[4] Zhejiang Normal Univ, ISTBI ZJNU Algorithm Ctr Brain Inspired Intellige, Jinhua, Zhejiang, Peoples R China
Keywords
3D reconstruction; human body reconstruction; multi-view stereo;
DOI
10.1145/3514248
CLC Classification Code
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
Three-dimensional (3D) human-like body reconstruction from a single RGB image has attracted significant research attention recently. Most existing methods rely on the Skinned Multi-Person Linear model and thus can only predict unified human bodies. Moreover, meshes reconstructed by current methods sometimes perform well from a canonical view but not from other views, as the reconstruction process is commonly supervised by only a single view. To address these limitations, this article proposes a multi-view shape generation network for a 3D human-like body. In particular, we propose a coarse-to-fine learning model that gradually deforms a template body toward the ground truth body. Our model utilizes the information of multi-view renderings and the corresponding 3D vertex transformations as supervision. Such supervision helps to generate 3D bodies well aligned to all views. To accurately operate mesh deformation, a graph convolutional network structure is introduced to support shape generation from the 3D vertex representation. Additionally, a graph up-pooling operation is designed over the intermediate representations of the graph convolutional network, so our model can generate 3D shapes at higher resolution. Novel loss functions are employed to help optimize the whole multi-view generation model, resulting in smoother surfaces. In addition, two multi-view human body datasets are produced and contributed to the community. Extensive experiments conducted on the benchmark datasets demonstrate the efficacy of our model over competing methods.
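The graph up-pooling operation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code; it assumes the common mesh-subdivision scheme in which each edge of the coarse mesh gains a midpoint vertex and each triangle is split into four finer triangles, increasing the resolution of the graph the network operates on:

```python
# Hypothetical sketch of one graph up-pooling step for coarse-to-fine
# mesh generation: split every edge at its midpoint and replace each
# triangle with four finer triangles. Function names are illustrative.
import numpy as np

def up_pool(vertices, faces):
    """vertices: (N, 3) float array; faces: (F, 3) int array of vertex ids."""
    verts = [np.asarray(v, dtype=float) for v in vertices]
    edge_mid = {}  # undirected edge -> id of its new midpoint vertex

    def midpoint(a, b):
        key = (min(a, b), max(a, b))
        if key not in edge_mid:
            edge_mid[key] = len(verts)
            verts.append((verts[a] + verts[b]) / 2.0)
        return edge_mid[key]

    new_faces = []
    for a, b, c in faces:
        ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
        # one coarse triangle -> four fine triangles
        new_faces += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return np.array(verts), np.array(new_faces)

# One step on a single triangle: 3 vertices / 1 face -> 6 vertices / 4 faces.
v = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
f = np.array([[0, 1, 2]])
v2, f2 = up_pool(v, f)
print(v2.shape, f2.shape)  # (6, 3) (4, 3)
```

Because new vertices are midpoints of existing ones, features learned at coarse vertices can be propagated (e.g., averaged) to the finer graph, which is what lets a coarse-to-fine model refine the template body stage by stage.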
Pages: 22
Cited References
54 records in total
  • [21] Human3.6M: Large Scale Datasets and Predictive Methods for 3D Human Sensing in Natural Environments
    Ionescu, Catalin
    Papava, Dragos
    Olaru, Vlad
    Sminchisescu, Cristian
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2014, 36 (07) : 1325 - 1339
  • [22] Johnson S, 2010, BMVC, DOI 10.5244/C.24.12
  • [23] End-to-end Recovery of Human Shape and Pose
    Kanazawa, Angjoo
    Black, Michael J.
    Jacobs, David W.
    Malik, Jitendra
    [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 7122 - 7131
  • [24] Convolutional Mesh Regression for Single-Image Human Shape Reconstruction
    Kolotouros, Nikos
    Pavlakos, Georgios
    Daniilidis, Kostas
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 4496 - 4505
  • [25] ImageNet Classification with Deep Convolutional Neural Networks
    Krizhevsky, Alex
    Sutskever, Ilya
    Hinton, Geoffrey E.
    [J]. COMMUNICATIONS OF THE ACM, 2017, 60 (06) : 84 - 90
  • [26] Unite the People: Closing the Loop Between 3D and 2D Human Representations
    Lassner, Christoph
    Romero, Javier
    Kiefel, Martin
    Bogo, Federica
    Black, Michael J.
    Gehler, Peter V.
    [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 4704 - 4713
  • [27] Lewiner T., 2003, Journal of Graphics Tools, V8, P1, DOI 10.1080/10867651.2003.10487582
  • [28] Shape-Aware Human Pose and Shape Reconstruction Using Multi-View Images
    Liang, Junbang
    Lin, Ming C.
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 4351 - 4361
  • [29] SMPL: A Skinned Multi-Person Linear Model
    Loper, Matthew
    Mahmood, Naureen
    Romero, Javier
    Pons-Moll, Gerard
    Black, Michael J.
    [J]. ACM TRANSACTIONS ON GRAPHICS, 2015, 34 (06):
  • [30] Lorensen W.E., 1987, COMPUT GRAPH, V21, P163, DOI 10.1145/37402.37422