TSC-PCAC: Voxel Transformer and Sparse Convolution-Based Point Cloud Attribute Compression for 3D Broadcasting

Cited by: 0
Authors
Guo, Zixi [1]
Zhang, Yun [1]
Zhu, Linwei [2]
Wang, Hanli [3]
Jiang, Gangyi [4]
Affiliations
[1] Sun Yat-sen Univ, Sch Elect & Commun Engn, Shenzhen Campus, Shenzhen 518017, Peoples R China
[2] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen 518055, Peoples R China
[3] Tongji Univ, Dept Comp Sci & Technol, Shanghai 200092, Peoples R China
[4] Ningbo Univ, Fac Informat & Sci & Engn, Ningbo 315211, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Point cloud compression; Image coding; Convolution; Three-dimensional displays; Geometry; Transforms; Transformers; voxel transformer; sparse convolution; variational autoencoder; channel context module;
DOI
10.1109/TBC.2024.3464417
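Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline classification codes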
0808 ; 0809 ;
Abstract
Point clouds have become a mainstream representation for advanced 3D applications such as virtual reality and augmented reality. However, the massive data volume of point clouds is one of the most challenging issues for transmission and storage. In this paper, we propose an end-to-end voxel Transformer and Sparse Convolution based Point Cloud Attribute Compression (TSC-PCAC) method for 3D broadcasting. First, we present the TSC-PCAC framework, which comprises a variational autoencoder built on Transformer and Sparse Convolutional Modules (TSCMs) and a channel context module. Second, we propose a two-stage TSCM, where the first stage models local dependencies and feature representations of the point clouds, and the second stage captures global features through spatial and channel pooling with larger receptive fields. This module effectively extracts global and local inter-point relevance to reduce informational redundancy. Third, we design a TSCM-based channel context module to exploit inter-channel correlations, which improves the predicted probability distribution of the quantized latent representations and thus reduces the bitrate. Experimental results indicate that the proposed TSC-PCAC method achieves average bitrate reductions of 38.53%, 21.30%, and 11.19% on the 8iVFB, Owlii, 8iVSLF, Volograms, and MVUB datasets compared to the Sparse-PCAC, NF-PCAC, and G-PCC v23 methods, respectively. The encoding/decoding time costs are reduced by 97.68%/98.78% on average compared to Sparse-PCAC. The source code and the trained TSC-PCAC models are available at https://github.com/igizuxo/TSC-PCAC.
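The abstract's claim that a better predicted probability distribution lowers the bitrate can be illustrated with a small numerical sketch. It is not the authors' implementation: the Gaussian entropy model, the integer quantization, the example values, and the names `gaussian_cdf` and `estimated_bits` are all assumptions chosen for illustration, following common practice in learned compression.

```python
import math

def gaussian_cdf(x, mean, scale):
    """CDF of a Gaussian with the given mean and standard deviation."""
    return 0.5 * (1.0 + math.erf((x - mean) / (scale * math.sqrt(2.0))))

def estimated_bits(symbols, means, scales):
    """Ideal code length in bits for integer-quantized symbols.

    Each symbol y is assigned the probability mass that its predicted
    Gaussian places on the interval [y - 0.5, y + 0.5] (assumed model).
    """
    total = 0.0
    for y, mu, sigma in zip(symbols, means, scales):
        p = gaussian_cdf(y + 0.5, mu, sigma) - gaussian_cdf(y - 0.5, mu, sigma)
        total += -math.log2(max(p, 1e-9))  # clamp to avoid log(0)
    return total

# Hypothetical quantized latent values for one channel group.
latents = [3, -2, 0, 5]

# Without context: a single broad, zero-mean distribution for every symbol.
bits_without_context = estimated_bits(latents, [0.0] * 4, [3.0] * 4)

# With context: means and scales predicted from already-decoded channels
# (made-up numbers standing in for a context module's output).
bits_with_context = estimated_bits(latents, [2.8, -1.9, 0.2, 4.7], [0.8] * 4)

print(f"without context: {bits_without_context:.2f} bits")
print(f"with context:    {bits_with_context:.2f} bits")
```

When the predicted means sit close to the actual quantized values and the predicted scales are narrow, each symbol receives more probability mass and therefore costs fewer bits, which is the effect a channel context module aims for.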
Pages: 154 - 166
Page count: 13