NeuralDome: A Neural Modeling Pipeline on Multi-View Human-Object Interactions

Cited by: 7

Authors:

Zhang, Juze ^{[1,2,3,4]}

Luo, Haimin ^{[1,4,5]}

Yang, Hongdi ^{[1,4]}

Xu, Xinru ^{[1,4]}

Wu, Qianyang ^{[1,4]}

Shi, Ye ^{[1,4]}

Yu, Jingyi ^{[1,4]}

Xu, Lan ^{[1,4]}

Wang, Jingya ^{[1,4]}

Affiliations:

[1] ShanghaiTech Univ, Shanghai, Peoples R China

[2] Chinese Acad Sci, Shanghai Adv Res Inst, Shanghai, Peoples R China

[3] Univ Chinese Acad Sci, Shanghai, Peoples R China

[4] Shanghai Engn Res Ctr Intelligent Vis & Imaging, Shanghai, Peoples R China

[5] LumiAni Technol, Hemel Hempstead, England

Source:

2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2023

DOI:

10.1109/CVPR52729.2023.00853

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Humans constantly interact with objects in daily life tasks. Capturing such processes and subsequently conducting visual inferences from a fixed viewpoint suffers from occlusions, shape and texture ambiguities, motions, etc. To mitigate the problem, it is essential to build a training dataset that captures free-viewpoint interactions. We construct a dense multi-view dome to acquire a complex human-object interaction dataset, named HODome, that consists of approximately 71M frames on 10 subjects interacting with 23 objects. To process the HODome dataset, we develop NeuralDome, a layer-wise neural processing pipeline tailored for multi-view video inputs to conduct accurate tracking, geometry reconstruction, and free-view rendering for both human subjects and objects. Extensive experiments on the HODome dataset demonstrate the effectiveness of NeuralDome on a variety of inference, modeling, and rendering tasks. Both the dataset and the NeuralDome tools will be disseminated to the community for further development, and can be found at https://juzezhang.github.io/NeuralDome

Pages: 8834-8845

Page count: 12

