Self-Supervised 3D Representation Learning of Dressed Humans From Social Media Videos

Citations: 0
Authors
Jafarian, Yasamin [1]
Park, Hyun Soo [1]
Affiliation
[1] Univ Minnesota, Minneapolis, MN 55455 USA
Keywords
Depth estimation; dataset; high fidelity human reconstruction; normal estimation; single view 3D reconstruction; self-supervised learning
DOI
10.1109/TPAMI.2022.3231558
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A key challenge of learning a visual representation for the 3D high fidelity geometry of dressed humans lies in the limited availability of the ground truth data (e.g., 3D scanned models), which results in the performance degradation of 3D human reconstruction when applying to real-world imagery. We address this challenge by leveraging a new data resource: a number of social media dance videos that span diverse appearance, clothing styles, performances, and identities. Each video depicts dynamic movements of the body and clothes of a single person while lacking the 3D ground truth geometry. To learn a visual representation from these videos, we present a new self-supervised learning method to use the local transformation that warps the predicted local geometry of the person from an image to that of another image at a different time instant. This allows self-supervision by enforcing a temporal coherence over the predictions. In addition, we jointly learn the depths along with the surface normals that are highly responsive to local texture, wrinkle, and shade by maximizing their geometric consistency. Our method is end-to-end trainable, resulting in high fidelity depth estimation that predicts fine geometry faithful to the input real image. We further provide a theoretical bound of self-supervised learning via an uncertainty analysis that characterizes the performance of the self-supervised learning without training. We demonstrate that our method outperforms the state-of-the-art human depth estimation and human shape recovery approaches on both real and rendered images.
Pages: 8969-8983
Page count: 15
Related papers
50 in total
  • [31] Multi-View 3D Human Pose Estimation with Self-Supervised Learning
    Chang, Inho
    Park, Min-Gyu
    Kim, Jaewoo
    Yoon, Ju Hong
    3RD INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE IN INFORMATION AND COMMUNICATION (IEEE ICAIIC 2021), 2021, : 255 - 257
  • [32] Learning on the Rings: Self-Supervised 3D Finger Motion Tracking Using Wearable Sensors
    Zhou, Hao
    Lu, Taiting
    Liu, Yilin
    Zhang, Shijia
    Gowda, Mahanth
    PROCEEDINGS OF THE ACM ON INTERACTIVE MOBILE WEARABLE AND UBIQUITOUS TECHNOLOGIES-IMWUT, 2022, 6 (02):
  • [33] Self-Supervised Learning and 3D Printing Technology in Facial Reconstruction and Defect Coverage
    Tung, N. T.
    Chau, Nguyen Dong
    Nguyen, Nghi N.
    Nguyen, Thanh Q.
    3D PRINTING AND ADDITIVE MANUFACTURING, 2025,
  • [34] Self-supervised learning for accelerated 3D high-resolution ultrasound imaging
    Dai, Xianjin
    Lei, Yang
    Wang, Tonghe
    Axente, Marian
    Xu, Dong
    Patel, Pretesh
    Jani, Ashesh B.
    Curran, Walter J.
    Liu, Tian
    Yang, Xiaofeng
    MEDICAL PHYSICS, 2021, 48 (07) : 3916 - 3926
  • [35] Attention-guided mask learning for self-supervised 3D action recognition
    Zhang, Haoyuan
    COMPLEX & INTELLIGENT SYSTEMS, 2024, 10 (06) : 7487 - 7496
  • [36] Self-supervised monocular depth estimation from oblique UAV videos
    Madhuanand, Logambal
    Nex, Francesco
    Yang, Michael Ying
    ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING, 2021, 176 : 1 - 14
  • [37] Self-supervised Consensus Representation Learning for Attributed Graph
    Liu, Changshu
    Wen, Liangjian
    Kang, Zhao
    Luo, Guangchun
    Tian, Ling
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 2654 - 2662
  • [38] TRIBYOL: TRIPLET BYOL FOR SELF-SUPERVISED REPRESENTATION LEARNING
    Li, Guang
    Togo, Ren
    Ogawa, Takahiro
    Haseyama, Miki
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 3458 - 3462
  • [39] ViewMix: Augmentation for Robust Representation in Self-Supervised Learning
    Das, Arjon
    Zhong, Xin
    IEEE ACCESS, 2024, 12 : 8461 - 8470
  • [40] CoBERT: Self-Supervised Speech Representation Learning Through Code Representation Learning
    Meng, Chutong
    Ao, Junyi
    Ko, Tom
    Wang, Mingxuan
    Li, Haizhou
    INTERSPEECH 2023, 2023, : 2978 - 2982