Deep Projective 3D Semantic Segmentation

Cited by: 275

Authors:

Lawin, Felix Jaremo [1]

Danelljan, Martin [1]

Tosteberg, Patrik [1]

Bhat, Goutam [1]

Khan, Fahad Shahbaz [1]

Felsberg, Michael [1]

Affiliations:

[1] Linkoping Univ, Dept Elect Engn, Comp Vis Lab, Linkoping, Sweden

Source:

COMPUTER ANALYSIS OF IMAGES AND PATTERNS | 2017 / Vol. 10424

Funding:

Swedish Research Council;

Keywords:

Point clouds; Semantic segmentation; Deep learning; Multi-stream deep networks;

DOI:

10.1007/978-3-319-64689-3_8

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Discipline classification code:

081202;

Abstract:

Semantic segmentation of 3D point clouds is a challenging problem with numerous real-world applications. While deep learning has revolutionized the field of image semantic segmentation, its impact on point cloud data has been limited so far. Recent attempts, based on 3D deep learning approaches (3D-CNNs), have achieved below-expected results. Such methods require voxelizations of the underlying point cloud data, leading to decreased spatial resolution and increased memory consumption. Additionally, 3D-CNNs greatly suffer from the limited availability of annotated datasets. In this paper, we propose an alternative framework that avoids the limitations of 3D-CNNs. Instead of directly solving the problem in 3D, we first project the point cloud onto a set of synthetic 2D-images. These images are then used as input to a 2D-CNN, designed for semantic segmentation. Finally, the obtained prediction scores are re-projected to the point cloud to obtain the segmentation results. We further investigate the impact of multiple modalities, such as color, depth and surface normals, in a multi-stream network architecture. Experiments are performed on the recent Semantic3D dataset. Our approach sets a new state-of-the-art by achieving a relative gain of 7.9%, compared to the previous best approach.
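The project, predict, and re-project pipeline summarized in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: this record does not describe the paper's rendering procedure or network, so the sketch assumes a simple pinhole camera with a z-buffer, and it replaces the 2D-CNN with a hand-written score image. The function names `project_points`, `render_index_map`, and `backproject_scores` are hypothetical and used only for illustration.

```python
import numpy as np


def project_points(points, focal, width, height):
    """Pinhole projection of points given in the camera frame (z forward).

    Returns integer pixel coordinates (u, v) and a mask of points that lie
    in front of the camera and inside the image. Assumed camera model.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    in_front = z > 1e-6
    z_safe = np.where(in_front, z, 1.0)  # avoid division by zero
    u = np.floor(focal * x / z_safe + width / 2.0).astype(int)
    v = np.floor(focal * y / z_safe + height / 2.0).astype(int)
    inside = in_front & (u >= 0) & (u < width) & (v >= 0) & (v < height)
    return u, v, inside


def render_index_map(points, focal, width, height):
    """Z-buffer: each pixel stores the index of its nearest point, or -1."""
    u, v, inside = project_points(points, focal, width, height)
    index_map = np.full((height, width), -1, dtype=int)
    depth = np.full((height, width), np.inf)
    for i in np.flatnonzero(inside):
        if points[i, 2] < depth[v[i], u[i]]:
            depth[v[i], u[i]] = points[i, 2]
            index_map[v[i], u[i]] = i
    return index_map, depth


def backproject_scores(score_image, index_map, num_points):
    """Add per-pixel class scores (H, W, C) onto the points visible there."""
    point_scores = np.zeros((num_points, score_image.shape[-1]))
    visible = index_map >= 0
    np.add.at(point_scores, index_map[visible], score_image[visible])
    return point_scores


# Toy example: point 1 is hidden behind point 0 in this view.
points = np.array([[0.0, 0.0, 2.0], [0.0, 0.0, 5.0], [1.0, 0.0, 2.0]])
index_map, depth = render_index_map(points, focal=2.0, width=4, height=4)

# Stand-in for the 2D-CNN output: two class scores per pixel.
score_image = np.zeros((4, 4, 2))
score_image[2, 2] = [0.1, 0.9]
score_image[2, 3] = [0.8, 0.2]

point_scores = backproject_scores(score_image, index_map, len(points))
labels = point_scores.argmax(axis=1)
print(labels)
```

In a multi-view setting, the score arrays returned for each view could be summed before taking the argmax, so that a point occluded in one view (such as point 1 here, which receives no scores) can still be labeled from another view.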


Pages: 95-107

Number of pages: 13

29 entries in total

[1]

Anguelov D, 2005, PROC CVPR IEEE, P169

[2]

[Anonymous], 2015, PROC CVPR IEEE, DOI 10.1109/CVPR.2015.7298801

[3]

Deng J, 2009, PROC CVPR IEEE, P248, DOI 10.1109/CVPRW.2009.5206848

[4]

Eitel A, 2015, IEEE INT C INT ROBOT, P681, DOI 10.1109/IROS.2015.7353446

[5] Convolutional Two-Stream Network Fusion for Video Action Recognition [J].

Feichtenhofer, Christoph ;

Pinz, Axel ;

Zisserman, Andrew .

2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, :1933-1941

[6] Learning Rich Features from RGB-D Images for Object Detection and Segmentation [J].

Gupta, Saurabh ;

Girshick, Ross ;

Arbelaez, Pablo ;

Malik, Jitendra .

COMPUTER VISION - ECCV 2014, PT VII, 2014, 8695 :345-360

[7]

Hackel T., 2016, ISPRS ANN ISPRS C PR

[8]

Hackel T., 2017, ISPRS Ann. Photogramm., Remote Sens. Spatial Inf. Sci., VIV-1/W1, P91, DOI 10.5194/isprs-annals-IV-1-W1-91-2017

[9] FAST SEMANTIC SEGMENTATION OF 3D POINT CLOUDS WITH STRONGLY VARYING DENSITY [J].

Hackel, Timo ;

Wegner, Jan D. ;

Schindler, Konrad .

XXIII ISPRS CONGRESS, COMMISSION III, 2016, 3 (03) :177-184

[10]

Huang Jing, 2016, INT C PATT REC ICPR, P2
