3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation

Cited by: 169

Authors:

Engelmann, Francis [1,2]

Bokeloh, Martin [2]

Fathi, Alireza [2]

Leibe, Bastian [1]

Niessner, Matthias [3]

Affiliations:

[1] RWTH Aachen University, Aachen, Germany

[2] Google, Mountain View, CA 94043, USA

[3] Technical University of Munich, Munich, Germany

Source:

2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2020), 2020

DOI:

10.1109/CVPR42600.2020.00905

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

We present 3D-MPA, a method for instance segmentation on 3D point clouds. Given an input point cloud, we propose an object-centric approach where each point votes for its object center. We sample object proposals from the predicted object centers. Then, we learn proposal features from grouped point features that voted for the same object center. A graph convolutional network introduces inter-proposal relations, providing higher-level feature learning in addition to the lower-level point features. Each proposal comprises a semantic label, a set of associated points over which we define a foreground-background mask, an objectness score, and aggregation features. Previous works usually perform non-maximum suppression (NMS) over proposals to obtain the final object detections or semantic instances. However, NMS can discard potentially correct predictions. Instead, our approach keeps all proposals and groups them together based on the learned aggregation features. We show that grouping proposals improves over NMS and outperforms previous state-of-the-art methods on the tasks of 3D object detection and semantic instance segmentation on the ScanNetV2 benchmark and the S3DIS dataset.
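The pipeline described in the abstract (vote for centers, sample proposals, associate points, then aggregate proposals instead of suppressing them) can be illustrated with a toy NumPy sketch. This is not the authors' implementation: every learned component is replaced by a hand-written stand-in. The function names, the `radius` and `num_proposals` parameters, the use of farthest point sampling, the use of the proposal center as the "aggregation feature", and the connected-components grouping are all assumptions made for illustration.

```python
import numpy as np


def farthest_point_sampling(points, k):
    """Pick k indices that are spread out over `points` (shape [N, 3])."""
    chosen = [0]
    dist = np.linalg.norm(points - points[0], axis=1)
    for _ in range(k - 1):
        idx = int(np.argmax(dist))
        chosen.append(idx)
        dist = np.minimum(dist, np.linalg.norm(points - points[idx], axis=1))
    return np.array(chosen)


def group_proposals(features, threshold):
    """Label proposals so that any two closer than `threshold` share a label.

    Plain connected-components grouping (union-find), used here as a
    stand-in for clustering learned aggregation features.
    """
    n = len(features)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if np.linalg.norm(features[i] - features[j]) < threshold:
                parent[find(i)] = find(j)
    roots = [find(i) for i in range(n)]
    _, labels = np.unique(roots, return_inverse=True)
    return labels


def segment_instances(points, offsets, num_proposals=8, radius=0.5):
    """Toy pipeline: vote, sample proposals, associate points, aggregate."""
    votes = points + offsets  # each point votes for its object center
    seeds = votes[farthest_point_sampling(votes, num_proposals)]
    # A point is associated with a proposal if its vote lands within `radius`.
    dists = np.linalg.norm(votes[:, None, :] - seeds[None, :, :], axis=2)
    member = dists < radius  # [N, num_proposals] boolean association
    # Assumption: the proposal center itself acts as the aggregation feature.
    proposal_labels = group_proposals(seeds, threshold=radius)
    # No proposal is discarded; points inherit the group of their nearest proposal.
    nearest = np.argmin(dists, axis=1)
    return np.where(member.any(axis=1), proposal_labels[nearest], -1)


# Synthetic scene with two objects (hypothetical data, not from the paper).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0, 0.0], [5.0, 0.0, 0.0]])
gt = np.repeat([0, 1], 50)
points = centers[gt] + rng.normal(scale=0.4, size=(100, 3))
# Assumption: near-perfect offset predictions with a little noise.
offsets = centers[gt] - points + rng.normal(scale=0.02, size=(100, 3))
instance = segment_instances(points, offsets)
```

In this sketch several of the eight proposals land on the same object. An NMS-style step would keep one per object and drop the rest, whereas the grouping step merges them into a single instance, which is the contrast the abstract draws.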

Pages: 9028-9037

Page count: 10
