GSV-NET: A Multi-Modal Deep Learning Network for 3D Point Cloud Classification

Cited by: 15
Authors
Hoang, Long [1]
Lee, Suk-Hwan [2]
Lee, Eung-Joo [3]
Kwon, Ki-Ryong [1]
Affiliations
[1] Pukyong Natl Univ, Dept Artificial Intelligence Convergence, Busan 48513, South Korea
[2] Dong A Univ, Dept Comp Engn, Busan 49315, South Korea
[3] Tongmyong Univ, Div Artificial Intelligence, Busan 48520, South Korea
Source
APPLIED SCIENCES-BASEL | 2022 / Vol. 12 / Issue 01
Funding
National Research Foundation, Singapore;
Keywords
Gaussian Supervector representation; enhancing region representation; 3D point cloud classification; deep learning-based approaches; multi-modality-based image processing; computer vision; OBJECT RECOGNITION; FEATURES;
DOI
10.3390/app12010483
Chinese Library Classification number
O6 [Chemistry];
Subject classification code
0703;
Abstract
Light Detection and Ranging (LiDAR), which uses light in the form of a pulsed laser to estimate the distance between the LiDAR sensor and objects, is an effective remote sensing technology. Many applications use LiDAR, including autonomous vehicles, robotics, and virtual and augmented reality (VR/AR). With the evolution of LiDAR technology, 3D point cloud classification has become an active research topic. This research aims to provide a high-performance method for 3D point cloud classification that is compatible with real-world data. More specifically, we introduce a novel framework for 3D point cloud classification, namely GSV-NET, which uses a Gaussian Supervector and enhancing region representation. GSV-NET extracts and combines both global and regional features of the 3D point cloud to further enrich the point cloud features used for classification. First, we input the Gaussian Supervector description into a 3D wide-inception convolutional neural network (CNN) structure to define the global feature. Second, we convert the regions of the 3D point cloud into a color representation and capture region features with a 2D wide-inception network. These extracted features are the inputs of a 1D CNN architecture. We evaluate the proposed framework on the point cloud dataset ModelNet and the LiDAR dataset Sydney. The ModelNet dataset was developed by Princeton University (New Jersey, United States), while the Sydney dataset was created by the University of Sydney (Sydney, Australia). Based on our numerical results, our framework achieves higher accuracy than the state-of-the-art approaches.
Pages: 20