Image Understands Point Cloud: Weakly Supervised 3D Semantic Segmentation via Association Learning

Cited by: 7
Authors
Sun, Tianfang [1 ]
Zhang, Zhizhong [1 ]
Tan, Xin [1 ,2 ]
Qu, Yanyun [3 ]
Xie, Yuan [1 ,2 ]
Affiliations
[1] East China Normal Univ, Sch Comp Sci & Technol, Shanghai 200060, Peoples R China
[2] East China Normal Univ, Chongqing Inst, Chongqing 401333, Peoples R China
[3] Xiamen Univ, Sch Informat, Dept Comp Sci & Technol, Xiamen 361005, Fujian, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Point cloud compression; Three-dimensional displays; Labeling; Laser radar; Annotations; Training; Semantic segmentation; Multimodal; weakly supervised; point cloud semantic segmentation;
DOI
10.1109/TIP.2024.3372449
CLC Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Weakly supervised point cloud semantic segmentation methods, which aim to match the performance of fully supervised approaches with 1% or fewer labels, have recently attracted extensive research attention. A typical solution in this setting uses self-training or pseudo-labeling to mine supervision from the point cloud itself while ignoring critical information from images. In fact, cameras are widely deployed alongside LiDAR, and this complementary information is highly valuable for 3D applications. In this paper, we propose a novel cross-modal weakly supervised method for 3D segmentation that incorporates complementary information from unlabeled images. We design a dual-branch network equipped with an active labeling strategy to maximize the value of the tiny fraction of labels and to directly realize 2D-to-3D knowledge transfer. We then establish a cross-modal self-training framework that iterates between parameter updating and pseudo-label estimation. In the training phase, we propose cross-modal association learning to mine complementary supervision from images by reinforcing the cycle consistency between 3D points and 2D superpixels. In the pseudo-label estimation phase, a pseudo-label self-rectification mechanism filters noisy labels, providing more accurate labels so the networks can be fully trained. Extensive experimental results demonstrate that our method even outperforms state-of-the-art fully supervised competitors with less than 1% actively selected annotations.
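The pseudo-label estimation step described in the abstract can be illustrated, in a generic form, as confidence-based filtering of per-point predictions: low-confidence pseudo-labels are discarded before the next self-training round. This is a minimal sketch of that idea only; the function name, threshold value, and use of a plain softmax maximum as the confidence score are illustrative assumptions, not the paper's exact self-rectification mechanism.

```python
import numpy as np

def rectify_pseudo_labels(logits, threshold=0.9, ignore_index=-1):
    """Generic pseudo-label filtering sketch (hypothetical helper).

    logits: (N, C) array of per-point class scores.
    Returns (N,) labels; points whose softmax confidence falls
    below `threshold` are marked with `ignore_index` so they do
    not contribute to the next training round.
    """
    # Numerically stable softmax over the class dimension.
    z = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(z)
    probs /= probs.sum(axis=1, keepdims=True)

    confidence = probs.max(axis=1)          # per-point max probability
    labels = probs.argmax(axis=1)           # candidate pseudo-label
    labels[confidence < threshold] = ignore_index
    return labels
```

In a cross-modal variant of this step, the confidence score could additionally be gated by agreement between the 3D prediction and the associated 2D superpixel prediction, in line with the cycle consistency the paper describes.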
Pages: 1838-1852
Page count: 15