Robust 3D Human Avatar Reconstruction From Monocular Videos Using Depth Optimization and Camera Pose Estimation

Citations: 0
Authors
Kim, Kyung Min [1]
Song, Byung Cheol [1]
Affiliations
[1] Inha Univ, Dept Elect & Comp Engn, Incheon 22212, South Korea
Keywords
Three-dimensional displays; Videos; Cameras; Image reconstruction; Avatars; Shape; Pose estimation; Depth measurement; Rendering (computer graphics); Accuracy; 3D avatar reconstruction; deformable 3D avatar; monocular video integration; structural alignment in 3D;
DOI
10.1109/ACCESS.2025.3556445
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
This paper presents a novel approach for 3D human avatar reconstruction from monocular RGB videos, overcoming the limitations of existing template-based methods such as BANMo. We introduce a two-fold optimization framework: first, it uses RelPose++ for accurate camera pose estimation; second, it incorporates depth maps to enhance 3D shape reconstruction. Our method minimizes so-called intra-frame and inter-frame distances, improving frame-level accuracy while maintaining temporal coherence across video frames. Extensive experiments on the MEAD, Multiface, and FEED datasets demonstrate the superiority of our approach in generating realistic, deformable 3D avatars, achieving significant improvements in Chamfer distance and F-score compared to existing methods. The framework is particularly effective in complex scenarios, such as bust-shot videos with partial views of subjects, where it offers robust, high-quality 3D reconstructions.
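The abstract reports results in Chamfer distance and F-score without defining them in this record. As a rough illustration only, the sketch below computes one common variant of each metric between two small 3D point sets. The function names, the use of unsquared Euclidean distances averaged in both directions, and the threshold parameter `tau` are all assumptions made for this example; the paper's actual evaluation protocol may differ.

```python
import math


def nearest_distances(src, dst):
    """For each point in src, Euclidean distance to its nearest point in dst."""
    return [min(math.dist(p, q) for q in dst) for p in src]


def chamfer_distance(pred, gt):
    """Symmetric Chamfer distance (assumed variant): sum of the two
    directional mean nearest-neighbor distances."""
    d_pred = nearest_distances(pred, gt)
    d_gt = nearest_distances(gt, pred)
    return sum(d_pred) / len(d_pred) + sum(d_gt) / len(d_gt)


def f_score(pred, gt, tau):
    """Harmonic mean of precision and recall at distance threshold tau.

    Precision: fraction of predicted points within tau of the ground truth.
    Recall: fraction of ground-truth points within tau of the prediction.
    """
    precision = sum(d <= tau for d in nearest_distances(pred, gt)) / len(pred)
    recall = sum(d <= tau for d in nearest_distances(gt, pred)) / len(gt)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: the third predicted point is offset by 0.5 along z.
gt = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.5)]

cd = chamfer_distance(pred, gt)
fs_tight = f_score(pred, gt, tau=0.1)
fs_loose = f_score(pred, gt, tau=1.0)
```

A lower Chamfer distance and a higher F-score both indicate a reconstruction closer to the reference surface. The brute-force nearest-neighbor search here is quadratic in the number of points, so a spatial index would be needed for dense meshes.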
Pages: 57886-57897
Page count: 12