TSC-PCAC: Voxel Transformer and Sparse Convolution-Based Point Cloud Attribute Compression for 3D Broadcasting

Cited by: 0

Authors:

Guo, Zixi [1]

Zhang, Yun [1]

Zhu, Linwei [2]

Wang, Hanli [3]

Jiang, Gangyi [4]

Affiliations:

[1] Sun Yat-sen Univ, Sch Elect & Commun Engn, Shenzhen Campus, Shenzhen 518017, Peoples R China

[2] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen 518055, Peoples R China

[3] Tongji Univ, Dept Comp Sci & Technol, Shanghai 200092, Peoples R China

[4] Ningbo Univ, Fac Informat & Sci & Engn, Ningbo 315211, Peoples R China

Source:

IEEE TRANSACTIONS ON BROADCASTING | 2025 / Vol. 71 / No. 01

Funding:

National Natural Science Foundation of China;

Keywords:

Point cloud compression; Image coding; Convolution; Three-dimensional displays; Geometry; Transforms; Transformers; voxel transformer; sparse convolution; variational autoencoder; channel context module;

DOI:

10.1109/TBC.2024.3464417

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Point clouds have become a mainstream representation for advanced 3D applications such as virtual reality and augmented reality. However, their massive data volume is one of the most challenging issues for transmission and storage. In this paper, we propose an end-to-end voxel Transformer and Sparse Convolution based Point Cloud Attribute Compression (TSC-PCAC) method for 3D broadcasting. First, we present the TSC-PCAC framework, which comprises a variational autoencoder and a channel context module, both based on the Transformer and Sparse Convolutional Module (TSCM). Second, we propose a two-stage TSCM, where the first stage models local dependencies and feature representations of the point clouds, and the second stage captures global features through spatial and channel pooling with larger receptive fields. This module effectively extracts global and local inter-point relevance to reduce informational redundancy. Third, we design a TSCM-based channel context module to exploit inter-channel correlations, which improves the predicted probability distribution of the quantized latent representations and thus reduces the bitrate. Experimental results indicate that the proposed TSC-PCAC method achieves average bitrate reductions of 38.53%, 21.30%, and 11.19% on the 8iVFB, Owlii, 8iVSLF, Volograms, and MVUB datasets compared to the Sparse-PCAC, NF-PCAC, and G-PCC v23 methods, respectively. The encoding/decoding time costs are reduced by 97.68%/98.78% on average compared to Sparse-PCAC. The source code and the trained TSC-PCAC models are available at https://github.com/igizuxo/TSC-PCAC.

Citation:

Pages: 154-166

Page count: 13

