Scalable SoftGroup for 3D Instance Segmentation on Point Clouds

Cited by: 8
Authors
Vu, Thang [1]
Kim, Kookhoi [1]
Nguyen, Thanh [1]
Luu, Tung M. [1]
Kim, Junyeong [2]
Yoo, Chang D. [1]
Affiliations
[1] Korea Adv Inst Sci & Technol, Sch Elect Engn, Daejeon 34141, South Korea
[2] Chung Ang Univ, Dept AI, Seoul 06974, South Korea
Keywords
Point clouds; point grouping; octree grouping; instance segmentation; object detection; panoptic segmentation
DOI
10.1109/TPAMI.2023.3326189
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper considers a network referred to as SoftGroup for accurate and scalable 3D instance segmentation. Existing state-of-the-art methods produce hard semantic predictions and then group points to obtain instance segmentation results. Unfortunately, errors stemming from the hard decisions propagate into the grouping, resulting in poor overlap between predicted instances and ground truth and in substantial false positives. To address these problems, SoftGroup allows each point to be associated with multiple classes to mitigate the uncertainty stemming from semantic prediction. It also suppresses false positive instances by learning to categorize them as background. Regarding scalability, the existing fast methods require computational time on the order of tens of seconds on large-scale scenes, which is unsatisfactory and far from real-time. Our finding is that the k-Nearest Neighbor (k-NN) module, which serves as the prerequisite of grouping, introduces a computational bottleneck. SoftGroup is extended to resolve this bottleneck; the extension is referred to as SoftGroup++. The proposed SoftGroup++ reduces time complexity with octree k-NN and reduces the search space with class-aware pyramid scaling and late devoxelization. Experimental results on various indoor and outdoor datasets demonstrate the efficacy and generality of the proposed SoftGroup and SoftGroup++. Their performance surpasses the best-performing baseline by a large margin (6% to 16%) in terms of AP50. On datasets with large-scale scenes, SoftGroup++ achieves a 6x speed boost on average compared to SoftGroup. Furthermore, SoftGroup can be extended to perform object detection and panoptic segmentation with nontrivial improvements over existing methods.
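The contrast the abstract draws between hard semantic predictions and letting each point belong to multiple classes can be illustrated with a small sketch. This is not the authors' implementation: the function names, the score threshold `tau`, and the toy scores are all assumptions made for illustration, and the abstract does not state how the multi-class association is actually computed.

```python
import numpy as np

def hard_assign(scores):
    """Hard decision: each point goes to exactly one class (its argmax)."""
    labels = scores.argmax(axis=1)
    return [np.flatnonzero(labels == c) for c in range(scores.shape[1])]

def soft_assign(scores, tau=0.2):
    """Soft decision (assumed rule): a point joins every class whose
    score exceeds the hypothetical threshold tau."""
    return [np.flatnonzero(scores[:, c] > tau) for c in range(scores.shape[1])]

# Toy per-point class scores for 4 points and 2 classes (made-up values).
scores = np.array([
    [0.90, 0.10],
    [0.55, 0.45],  # ambiguous point
    [0.45, 0.55],  # ambiguous point
    [0.05, 0.95],
])

hard = hard_assign(scores)           # per-class point indices, one class per point
soft = soft_assign(scores, tau=0.2)  # ambiguous points appear in both classes
```

With these toy scores, the two ambiguous points are committed to a single class under the hard rule but remain candidates for both classes under the soft rule, so a later grouping step could still recover them if the top-scoring class was wrong.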
Pages: 1981-1995
Page count: 15