Superpoint Transformer for 3D Scene Instance Segmentation

Cited by: 0

Authors:

Sun, Jiahao [1]

Qing, Chunmei [1]

Tan, Junpeng [1]

Xu, Xiangmin [2]

Affiliations:

[1] South China Univ Technol, Sch Elect & Informat Engn, Guangzhou, Peoples R China

[2] South China Univ Technol, Sch Future Technol, Guangzhou, Peoples R China

Source:

THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 2 | 2023

Funding:

National Natural Science Foundation of China; National Key Research and Development Program of China

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Most existing methods realize 3D instance segmentation by extending models used for 3D object detection or 3D semantic segmentation. However, these non-straightforward methods suffer from two drawbacks: 1) Imprecise bounding boxes or unsatisfactory semantic predictions limit the performance of the overall 3D instance segmentation framework. 2) Existing methods require a time-consuming intermediate aggregation step. To address these issues, this paper proposes a novel end-to-end 3D instance segmentation method based on Superpoint Transformer, named SPFormer. It groups potential features from point clouds into superpoints and directly predicts instances through query vectors, without relying on the results of object detection or semantic segmentation. The key step in this framework is a novel query decoder with transformers that captures instance information through the superpoint cross-attention mechanism and generates the superpoint masks of the instances. Through bipartite matching based on superpoint masks, SPFormer can be trained without the intermediate aggregation step, which accelerates the network. Extensive experiments on the ScanNetv2 and S3DIS benchmarks verify that the method is concise yet efficient. Notably, SPFormer exceeds the compared state-of-the-art methods by 4.3% in terms of mAP on the ScanNetv2 hidden test set while keeping a fast inference speed (247 ms per frame). Code is available at https://github.com/sunjiahao1999/SPFormer.
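The two ideas the abstract names, pooling point features into superpoints and matching predicted superpoint masks to ground-truth instances one-to-one, can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that). The function names, the mean pooling, the Dice-only matching cost, and the brute-force search standing in for a proper assignment solver are all assumptions made for illustration; the abstract does not specify any of them.

```python
from itertools import permutations


def pool_superpoints(point_feats, superpoint_ids, num_superpoints):
    """Average the features of all points that share a superpoint id (assumed pooling)."""
    dim = len(point_feats[0])
    sums = [[0.0] * dim for _ in range(num_superpoints)]
    counts = [0] * num_superpoints
    for feat, sp in zip(point_feats, superpoint_ids):
        counts[sp] += 1
        for d in range(dim):
            sums[sp][d] += feat[d]
    # Empty superpoints keep a zero feature vector.
    return [[v / max(c, 1) for v in row] for row, c in zip(sums, counts)]


def dice_cost(pred_mask, gt_mask, eps=1e-6):
    """1 - Dice overlap between a soft predicted mask and a binary ground-truth mask."""
    inter = sum(p * g for p, g in zip(pred_mask, gt_mask))
    return 1.0 - (2.0 * inter + eps) / (sum(pred_mask) + sum(gt_mask) + eps)


def match_queries(pred_masks, gt_masks):
    """Assign each ground-truth instance to a distinct query, minimizing total cost.

    Brute force over permutations; only practical for tiny inputs.
    Returns a list where entry g is the query index matched to instance g.
    """
    if len(gt_masks) > len(pred_masks):
        raise ValueError("need at least as many queries as ground-truth instances")
    best_cost, best = float("inf"), ()
    for perm in permutations(range(len(pred_masks)), len(gt_masks)):
        cost = sum(dice_cost(pred_masks[q], gt_masks[g]) for g, q in enumerate(perm))
        if cost < best_cost:
            best_cost, best = cost, perm
    return list(best)


# Three points, two superpoints: points 0 and 1 belong to superpoint 0.
pooled = pool_superpoints([[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]], [0, 0, 1], 2)

# Three query masks over four superpoints, two ground-truth instances.
pred = [[0.1, 0.1, 0.9, 0.8], [0.9, 0.8, 0.1, 0.0], [0.2, 0.1, 0.1, 0.1]]
gt = [[1, 1, 0, 0], [0, 0, 1, 1]]
assignment = match_queries(pred, gt)
```

Because masks are defined over superpoints rather than individual points, the matching works on far fewer elements, and unmatched queries (query 2 in this toy example) would simply be treated as predicting no instance.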

Pages: 2393-2401

Page count: 9

50 entries in total

[21] Neural Segmentation Field in 3D Scene [J].

Huang, Tsung-Wei ;

Tu, Peihan ;

Su, Guan-Ming .

FIFTY-SEVENTH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS & COMPUTERS, IEEECONF, 2023, :1141-1145

[22] Scene segmentation from 3D motion [J].

Feng, XL ;

Perona, P .

1998 IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, PROCEEDINGS, 1998, :225-231

[23] Point Cloud Instance Segmentation Method Based on Superpoint Graph [J].

Wang Z. ;

Yu Z. ;

Wei G. ;

Sun Y. .

Tongji Daxue Xuebao/Journal of Tongji University, 2020, 48 (09) :1377-1384

[24] Motion analysis and segmentation in 3D scene [J].

Zhang, J. ;

Zhu, G. ;

Liu, W. ;

Liu, D. .

Huazhong Ligong Daxue Xuebao/Journal Huazhong (Central China) University of Science and Technology, 2001, 29 (08) :26-28

[25] SoftGroup for 3D Instance Segmentation on Point Clouds [J].

Thang Vu ;

Kim, Kookhoi ;

Luu, Tung M. ;

Thanh Nguyen ;

Yoo, Chang D. .

2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, :2698-2707

[26] Learning 3D Semantic Scene Graphs with Instance Embeddings [J].

Johanna Wald ;

Nassir Navab ;

Federico Tombari .

International Journal of Computer Vision, 2022, 130 :630-651

[27] Learning 3D Semantic Scene Graphs with Instance Embeddings [J].

Wald, Johanna ;

Navab, Nassir ;

Tombari, Federico .

INTERNATIONAL JOURNAL OF COMPUTER VISION, 2022, 130 (03) :630-651

[28] SGIFormer: Semantic-Guided and Geometric-Enhanced Interleaving Transformer for 3D Instance Segmentation [J].

Yao, Lei ;

Wang, Yi ;

Liu, Moyun ;

Chau, Lap-Pui .

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2025, 35 (03) :2276-2288

[29] Instance Consistency Regularization for Semi-Supervised 3D Instance Segmentation [J].

Wu, Yizheng ;

Pan, Zhiyu ;

Wang, Kewei ;

Li, Xingyi ;

Cui, Jiahao ;

Xiao, Liwen ;

Lin, Guosheng ;

Cao, Zhiguo .

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, 46 (12) :9567-9582

[30] Uncertainty-Aware Superpoint Graph Transformer for Weakly Supervised 3-D Semantic Segmentation [J].

Fan, Yan ;

Wang, Yu ;

Zhu, Pengfei ;

Hui, Le ;

Xie, Jin ;

Hu, Qinghua .

IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2025, 33 (06) :1899-1912
