OctFormer: Efficient Octree-Based Transformer for Point Cloud Compression with Local Enhancement

Cited by: 0
Authors
Cui, Mingyue [1]
Long, Junhua [1]
Feng, Mingjian [1]
Li, Boyang [1]
Huang, Kai [1, 2]
Affiliations
[1] Sun Yat Sen Univ, Sch Comp Sci & Engn, Guangzhou, Peoples R China
[2] Sun Yat Sen Univ, Shenzhen Inst, Shenzhen, Peoples R China
Source
THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 1 | 2023
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Point cloud compression with a high compression ratio and minimal loss is essential for efficient data transmission. However, previous methods that depend on 3D convolution or frequent multi-head self-attention operations incur heavy computation. To address this problem, we propose an octree-based Transformer compression method called OctFormer, which does not rely on the occupancy information of sibling nodes. Our method uses non-overlapping context windows to construct octree node sequences and shares the result of a single multi-head self-attention operation among all nodes in a sequence. In addition, we introduce a local enhancement module that exploits sibling features and a positional encoding generator that enhances the translation invariance of the octree node sequence. Relative to previous state-of-the-art work, our method achieves up to 17% Bpp (bits per point) savings over the voxel-context-based baseline and reduces overall coding time by 99% relative to the attention-based baseline.
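The non-overlapping context windows mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `split_into_windows`, the padding value, the window size, and the sample occupancy codes are all hypothetical, chosen only to show how a flat sequence of octree nodes might be partitioned so that one attention pass serves every node in a window.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_into_windows(nodes: Sequence[T], window_size: int, pad: T) -> List[List[T]]:
    """Partition a node sequence into non-overlapping, equal-length windows.

    The last window is right-padded with `pad`. Padding is an assumption of
    this sketch, not something stated in the abstract.
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    windows: List[List[T]] = []
    for start in range(0, len(nodes), window_size):
        window = list(nodes[start:start + window_size])
        window.extend([pad] * (window_size - len(window)))
        windows.append(window)
    return windows


# Hypothetical occupancy codes of octree nodes, e.g. in breadth-first order.
occupancy_codes = [17, 255, 3, 128, 64, 9, 200]
windows = split_into_windows(occupancy_codes, window_size=3, pad=0)

# With shared attention, the number of attention passes scales with the
# number of windows rather than with the number of nodes.
attention_passes = len(windows)
```

Under these assumptions, seven nodes with a window size of three need three attention passes instead of seven, which is the kind of saving the shared-attention design aims at.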
Pages: 470-478
Page count: 9