3D indoor point cloud semantic segmentation using image and voxel

Cited by: 0
Authors
Yeom S.-S. [1 ]
Ha J.-E. [2 ]
Affiliations
[1] Graduate School of Automotive Engineering, Seoul National University of Science and Technology
[2] Department of Mechanical and Automotive Engineering, Seoul National University of Science and Technology
Source
Ha, Jong-Eun (jeha@seoultech.ac.kr) | Institute of Control, Robotics and Systems, Vol. 27 (2021)
Keywords
3D Vision; Point Cloud; Semantic Segmentation;
DOI
10.5302/J.ICROS.2021.21.0142
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we propose a parallel network architecture that improves performance by fusing two-dimensional (2D) and three-dimensional (3D) features. A voxel-based and a projection-based method were adopted to derive results from a single scan. Our approach consists of two parallel networks that extract features along each dimension and converge in a fusion network. In the fusion network, the voxel blocks and 2D feature maps extracted by each branch are fused into the voxel grid and then trained through convolution. For effective training of the 2D network, we use data augmentation based on coordinate-system rotation. In addition, a multi-loss with weights applied to each dimension was employed to enhance performance, and the results show that it outperforms a single loss. Our method can achieve further gains as the performance of the constituent 2D and 3D networks improves, and it generalizes to other network structures. © ICROS 2021.
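Two ingredients of the abstract, rotation-based augmentation of point coordinates and a weighted multi-loss over the 2D, 3D, and fused branches, can be sketched in a few lines. This is a minimal illustration under assumed conventions: the function names, the choice of z-axis rotation, and the weight values are hypothetical, not taken from the paper.

```python
import numpy as np

def rotate_z(points, angle_rad):
    """Rotation-based augmentation: rotate an (N, 3) point cloud about the z-axis.

    Indoor scans are usually gravity-aligned, so rotating about z preserves
    the scene's upright structure while varying the 2D projection.
    """
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    R = np.array([[c, -s, 0.0],
                  [s,  c, 0.0],
                  [0.0, 0.0, 1.0]])
    return points @ R.T

def weighted_multi_loss(loss_2d, loss_3d, loss_fused, w=(0.3, 0.3, 1.0)):
    """Combine per-branch losses with fixed weights.

    The weights here are placeholders; the idea is only that each branch
    contributes a supervised term, with the fused output weighted highest.
    """
    return w[0] * loss_2d + w[1] * loss_3d + w[2] * loss_fused
```

Giving the fused branch the largest weight reflects that it produces the final prediction, while the smaller 2D/3D terms act as auxiliary supervision; the actual ratio would need to be tuned per dataset.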
Pages: 1000-1007
Page count: 7
Related papers
24 references in total
[11]  
Choy C., Gwak J.-Y., Savarese S., 4D spatio-temporal ConvNets: Minkowski convolutional neural networks, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3075-3084, (2019)
[12]  
Liu Z., Tang H., Lin Y., Han S., Point-voxel CNN for efficient 3D deep learning, arXiv, 1907, (2019)
[13]  
Cicek O., Abdulkadir A., Lienkamp S.S., Brox T., Ronneberger O., 3D U-Net: Learning dense volumetric segmentation from sparse annotation, International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 424-432, (2016)
[14]  
Wu B., Wan A., Yue X., Keutzer K., SqueezeSeg: Convolutional neural nets with recurrent CRF for real-time road-object segmentation from 3D LiDAR point cloud, IEEE International Conference on Robotics and Automation (ICRA), pp. 1887-1893, (2018)
[15]  
Wu B., Zhou Z., Zhao S., Yue X., SqueezeSegV2: Improved model structure and unsupervised domain adaptation for road-object segmentation from a LiDAR point cloud, International Conference on Robotics and Automation (ICRA), pp. 4376-4382, (2019)
[16]  
Cortinhal T., Tzelepis G., Aksoy E.E., SalsaNext: Fast, uncertainty-aware semantic segmentation of LiDAR point clouds for autonomous driving, arXiv, 2003, (2020)
[17]  
Milioto A., Vizzo I., Behley J., Stachniss C., RangeNet++: Fast and accurate LiDAR semantic segmentation, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 4213-4220, (2019)
[18]  
Xu C., Wu B., Wang Z., Zhan W., Vajda P., Keutzer K., Tomizuka M., SqueezeSegV3: Spatially-adaptive convolution for efficient point-cloud segmentation, European Conference on Computer Vision, Springer, Cham, pp. 1-19, (2020)
[19]  
Tchapmi L., Choy C., Armeni I., Gwak J.-Y., Savarese S., SegCloud: Semantic segmentation of 3D point clouds, 2017 International Conference on 3D Vision (3DV), pp. 537-547, (2017)
[20]  
Tatarchenko M., Park J., Koltun V., Zhou Q.-Y., Tangent convolutions for dense prediction in 3D, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3887-3896, (2018)