3D Directional Encoding for Point Cloud Analysis

Cited by: 0
Authors
Jung, Yoonjae [1 ]
Lee, Sang-Hyun [2 ]
Seo, Seung-Woo [1 ]
Affiliations
[1] Seoul Natl Univ, Dept Elect & Comp Engn, Seoul 08826, South Korea
[2] Ajou Univ, Dept AI Mobil Engn, Suwon 16499, South Korea
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Feature extraction; Vectors; Point cloud compression; Three-dimensional displays; Encoding; Transformers; Network architecture; Data mining; Computer architecture; Neural networks; Information retrieval; Classification; deep learning; directional feature extraction; efficient neural network; point cloud; segmentation;
DOI
10.1109/ACCESS.2024.3472301
CLC Classification
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
Extracting informative local features from point clouds is crucial for accurately understanding the spatial information inside 3D point data. Previous works utilize either complex network designs or simple multi-layer perceptrons (MLPs) to extract local features. However, complex networks often incur high computational cost, whereas simple MLPs may struggle to capture the spatial relations among local points effectively. These challenges limit their scalability to delicate, real-time tasks such as autonomous driving and robot navigation. To address them, we propose a novel 3D Directional Encoding Network (3D-DENet) capable of effectively encoding spatial relations at low computational cost. 3D-DENet extracts spatial and point features separately. The key component of 3D-DENet for spatial feature extraction is Directional Encoding (DE), which encodes the cosine similarity between the direction vectors of local points and trainable direction vectors. To extract point features, we also propose Local Point Feature Multi-Aggregation (LPFMA), which integrates various aspects of local point features using diverse aggregation functions. By leveraging DE and LPFMA in a hierarchical structure, 3D-DENet efficiently captures both detailed spatial and high-level semantic features from point clouds. Experiments show that 3D-DENet is effective and efficient in classification and segmentation tasks. In particular, 3D-DENet achieves an overall accuracy of 90.7% and a mean accuracy of 90.1% on ScanObjectNN, outperforming the current state-of-the-art method while using only 47% of its floating-point operations.
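The abstract's two key components can be illustrated with a minimal sketch. This is a hypothetical reconstruction, not the authors' implementation: `directional_encoding` computes cosine similarities between unit direction vectors of local neighbors and a set of learnable direction vectors, and `lpfma` concatenates several aggregation views of local point features. Function names, array shapes, and the choice of aggregators (max, mean, std) are assumptions.

```python
import numpy as np

def directional_encoding(neighbors, centers, directions, eps=1e-8):
    """Sketch of Directional Encoding (DE).

    neighbors:  (N, K, 3) coordinates of the K local points of each query point.
    centers:    (N, 3) query point coordinates.
    directions: (D, 3) trainable direction vectors (plain arrays here).
    Returns (N, K, D) cosine similarities between local and trainable directions.
    """
    rel = neighbors - centers[:, None, :]                # local offset vectors
    rel = rel / (np.linalg.norm(rel, axis=-1, keepdims=True) + eps)
    dirs = directions / (np.linalg.norm(directions, axis=-1, keepdims=True) + eps)
    return rel @ dirs.T                                  # unit-vector dot products

def lpfma(features):
    """Sketch of Local Point Feature Multi-Aggregation (LPFMA).

    features: (N, K, C) per-neighborhood point features.
    Returns (N, 3*C): concatenation of several aggregation views; the
    specific aggregators are assumed, the paper only says "diverse
    aggregation functions".
    """
    return np.concatenate([features.max(axis=1),
                           features.mean(axis=1),
                           features.std(axis=1)], axis=-1)
```

In a trainable setting, `directions` would be a learnable parameter updated by backpropagation; because both operands are unit vectors, the DE outputs lie in [-1, 1].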
Pages: 144533-144543
Page count: 11