Pose-Driven Compression for Dynamic 3D Human via Human Prior Models

Cited by: 0

Authors:

Yan, Ruoke [1]

Yin, Qian [2]

Zhang, Xinfeng [2]

Zhang, Qi

Zhang, Gai

Ma, Siwei

Affiliations:

[1] Peking Univ, Sch Comp Sci, Natl Engn Lab Video Technol, Beijing 100871, Peoples R China

[2] Peking Univ, Sch Comp Sci, Beijing 100049, Peoples R China

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2024 / Vol. 46 / No. 08

Funding:

National Key Research and Development Program of China;

Keywords:

Three-dimensional displays; Point cloud compression; Bit rate; Image coding; Dynamics; Solid modeling; Data models; Dynamic 3D human compression; human prior models; pose-driven representation;

DOI:

10.1109/TPAMI.2024.3368567

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

To cost-effectively transmit high-quality dynamic 3D human images in immersive multimedia applications, efficient data compression is crucial. Unlike existing methods that focus on reducing signal-level reconstruction errors, we propose the first dynamic 3D human compression framework based on human priors. The layered coding architecture significantly enhances the perceptual quality while also supporting a variety of downstream tasks, including visual analysis and content editing. Specifically, a high-fidelity pose-driven Avatar is generated from the original frames as the basic structure layer to implicitly represent the human shape. Then, human movements between frames are parameterized via a commonly-used human prior model, i.e., the Skinned Multi-Person Linear Model (SMPL), to form the motion layer and drive the Avatar. Furthermore, the normals are also introduced as an enhancement layer to preserve fine-grained geometric details. Finally, the Avatar, SMPL parameters, and normal maps are efficiently compressed into layered semantic bitstreams. Extensive qualitative and quantitative experiments show that the proposed framework remarkably outperforms other state-of-the-art 3D codecs in terms of subjective quality with only a few bits. More notably, as the size or frame number of the 3D human sequence increases, the superiority of our framework in perceptual quality becomes more significant while saving more bitrates.

Pages: 5820-5834

Page count: 15

