Cross-view and Cross-pose Completion for 3D Human Understanding

Cited by: 0
Authors
Armando, Matthieu [1 ]
Galaaoui, Salma [1 ]
Baradel, Fabien [1 ]
Lucas, Thomas [1 ]
Leroy, Vincent [1 ]
Bregier, Romain [1 ]
Weinzaepfel, Philippe [1 ]
Rogez, Gregory [1 ]
Affiliations
[1] NAVER LABS Europe, Meylan, France
Keywords
DOI
10.1109/CVPR52733.2024.00150
CLC Number (Chinese Library Classification)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Human perception and understanding is a major domain of computer vision which, like many other vision subdomains recently, stands to gain from the use of large models pre-trained on large datasets. We hypothesize that the most common pre-training strategy of relying on general-purpose, object-centric image datasets such as ImageNet is limited by an important domain shift. On the other hand, collecting domain-specific ground truth such as 2D or 3D labels does not scale well. Therefore, we propose a pre-training approach based on self-supervised learning that works on human-centric data using only images. Our method uses pairs of images of humans: the first is partially masked and the model is trained to reconstruct the masked parts given the visible ones and a second image. It relies on both stereoscopic (cross-view) pairs and temporal (cross-pose) pairs taken from videos in order to learn priors about 3D as well as human motion. We pre-train a model for body-centric tasks and one for hand-centric tasks. With a generic transformer architecture, these models outperform existing self-supervised pre-training methods on a wide set of human-centric downstream tasks, and obtain state-of-the-art performance, for instance, when fine-tuning for model-based and model-free human mesh recovery.
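To make the pre-training objective described in the abstract concrete, here is a minimal PyTorch sketch of masked completion conditioned on a second image: patches of one image of a person are masked, both views are encoded by a transformer, and a decoder that cross-attends to the reference view reconstructs the missing patches. All names, layer counts, dimensions, and the mask ratio (CrossCompletionSketch, PATCH, DIM, mask_ratio) are illustrative assumptions, not the authors' released architecture or code.

```python
# Minimal sketch (assumed, not the authors' code) of cross-view / cross-pose
# completion pre-training: reconstruct masked patches of one image of a person
# given its visible patches and a second image of the same person.
import torch
import torch.nn as nn

PATCH, DIM = 16, 256  # hypothetical patch size and embedding width


class CrossCompletionSketch(nn.Module):
    def __init__(self, img_size=224, mask_ratio=0.75):
        super().__init__()
        self.num_patches = (img_size // PATCH) ** 2
        self.mask_ratio = mask_ratio
        self.patch_embed = nn.Conv2d(3, DIM, kernel_size=PATCH, stride=PATCH)
        self.pos = nn.Parameter(torch.zeros(1, self.num_patches, DIM))
        enc_layer = nn.TransformerEncoderLayer(DIM, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=4)
        dec_layer = nn.TransformerDecoderLayer(DIM, nhead=8, batch_first=True)
        # The decoder cross-attends to the reference (second) view.
        self.decoder = nn.TransformerDecoder(dec_layer, num_layers=4)
        self.mask_token = nn.Parameter(torch.zeros(1, 1, DIM))
        self.head = nn.Linear(DIM, 3 * PATCH * PATCH)  # predict raw pixels per patch

    def patchify(self, img):
        # (B, 3, H, W) -> (B, N, 3*PATCH*PATCH) ground-truth pixel targets
        b = img.shape[0]
        p = img.unfold(2, PATCH, PATCH).unfold(3, PATCH, PATCH)
        return p.permute(0, 2, 3, 1, 4, 5).reshape(b, self.num_patches, -1)

    def forward(self, masked_view, ref_view):
        b = masked_view.shape[0]
        tokens = self.patch_embed(masked_view).flatten(2).transpose(1, 2) + self.pos
        ref = self.patch_embed(ref_view).flatten(2).transpose(1, 2) + self.pos

        # Random per-sample mask over the patches of the first view.
        n_keep = int(self.num_patches * (1 - self.mask_ratio))
        ids = torch.rand(b, self.num_patches, device=tokens.device).argsort(dim=1)
        keep, drop = ids[:, :n_keep], ids[:, n_keep:]
        visible = torch.gather(tokens, 1, keep.unsqueeze(-1).expand(-1, -1, DIM))

        # Encode both streams, then decode the full token set (visible tokens
        # plus mask tokens) while cross-attending to the reference view.
        enc_vis = self.encoder(visible)
        enc_ref = self.encoder(ref)
        full = self.mask_token.expand(b, self.num_patches, -1).contiguous()
        full = full.scatter(1, keep.unsqueeze(-1).expand(-1, -1, DIM), enc_vis)
        pred = self.head(self.decoder(full + self.pos, enc_ref))

        # Reconstruction loss on the masked patches only.
        target = self.patchify(masked_view)
        masked_pred = torch.gather(pred, 1, drop.unsqueeze(-1).expand(-1, -1, pred.shape[-1]))
        masked_tgt = torch.gather(target, 1, drop.unsqueeze(-1).expand(-1, -1, target.shape[-1]))
        return nn.functional.mse_loss(masked_pred, masked_tgt)


# Usage: the pair would be a stereoscopic (cross-view) pair or two video frames
# of the same person (cross-pose); random tensors stand in for such a pair here.
model = CrossCompletionSketch()
loss = model(torch.randn(2, 3, 224, 224), torch.randn(2, 3, 224, 224))
loss.backward()
```

After pre-training with such an objective, the abstract indicates that the encoder is fine-tuned on human-centric downstream tasks such as model-based and model-free human mesh recovery; the loss above only illustrates the self-supervised stage.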
Pages: 1512 - 1523
Number of pages: 12
Related Papers
50 records in total
  • [1] Cross-View Tracking for Multi-Human 3D Pose Estimation at over 100 FPS
    Chen, Long
    Ai, Haizhou
    Chen, Rui
    Zhuang, Zijie
    Liu, Shuang
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, : 3276 - 3285
  • [2] Cross View Fusion for 3D Human Pose Estimation
    Qiu, Haibo
    Wang, Chunyu
    Wang, Jingdong
    Wang, Naiyan
    Zeng, Wenjun
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 4341 - 4350
  • [3] UPose3D: Uncertainty-Aware 3D Human Pose Estimation with Cross-view and Temporal Cues
    Davoodnia, Vandad
    Ghorbani, Saeed
    Carbonneau, Marc-Andre
    Messier, Alexandre
    Etemad, Ali
    COMPUTER VISION - ECCV 2024, PT XVI, 2025, 15074 : 19 - 38
  • [4] Cross-View Self-fusion for Self-supervised 3D Human Pose Estimation in the Wild
    Kim, Hyun-Woo
    Lee, Gun-Hee
    Oh, Myeong-Seok
    Lee, Seong-Whan
    COMPUTER VISION - ACCV 2022, PT I, 2023, 13841 : 193 - 210
  • [5] Convolutional Cross-View Pose Estimation
    Xia, Zimin
    Booij, Olaf
    Kooij, Julian F. P.
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, 46 (05) : 3813 - 3831
  • [6] Human pose estimation based on cross-view feature fusion
    Sun, Dandan
    Wang, Siqi
    Xia, Hailun
    Zhang, Changan
    Gao, Jianlong
    Mao, Mingyu
    VISUAL COMPUTER, 2024, 40 (09) : 6581 - 6597
  • [7] Fast Landmark Localization With 3D Component Reconstruction and CNN for Cross-Pose Recognition
    Hsu, Gee-Sern
    Shie, Hung-Cheng
    Hsieh, Cheng-Hua
    Chan, Jui-Shan
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2018, 28 (11) : 3194 - 3207
  • [8] Weakly-Supervised 3D Human Pose Estimation With Cross-View U-Shaped Graph Convolutional Network
    Hua, Guoliang
    Liu, Hong
    Li, Wenhao
    Zhang, Qian
    Ding, Runwei
    Xu, Xin
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 1832 - 1843
  • [9] 3D Human Action Representation Learning via Cross-View Consistency Pursuit
    Li, Linguo
    Wang, Minsi
    Ni, Bingbing
    Wang, Hang
    Yang, Jiancheng
    Zhang, Wenjun
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 4739 - 4748
  • [10] Cross-view Transformer for enhanced multi-view 3D reconstruction
    Shi, Wuzhen
    Yin, Aixue
    Li, Yingxiang
    Qian, Bo
    VISUAL COMPUTER, 2024,