Self-supervised Exclusive Learning for 3D Segmentation with Cross-modal Unsupervised Domain Adaptation

Times Cited: 12
Authors
Zhang, Yachao [1 ]
Li, Miaoyu [1 ]
Xie, Yuan [2 ]
Li, Cuihua [1 ]
Wang, Cong [3 ]
Zhang, Zhizhong [2 ]
Qu, Yanyun [1 ]
Affiliations
[1] Xiamen Univ, Xiamen, Peoples R China
[2] East China Normal Univ, Shanghai, Peoples R China
[3] Huawei Technol, Shenzhen, Peoples R China
Source
PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022 | 2022
Funding
Natural Science Foundation of Shanghai; National Natural Science Foundation of China;
Keywords
Cross-modality; Semantic segmentation; Unsupervised domain adaptation; Self-supervised exclusive learning; Mixed domain;
DOI
10.1145/3503161.3547987
CLC Number
TP39 [Computer Applications];
Discipline Codes
081203; 0835;
Abstract
2D-3D unsupervised domain adaptation (UDA) tackles the lack of annotations in a new domain by capitalizing on the relationship between 2D and 3D data. Existing methods achieve considerable improvements by performing cross-modality alignment in a modality-agnostic way, but fail to exploit modality-specific characteristics for modeling complementarity. In this paper, we present self-supervised exclusive learning for cross-modal semantic segmentation under the UDA scenario, which avoids prohibitively expensive annotation. Specifically, two self-supervised tasks are designed: "plane-to-spatial" and "discrete-to-textured". The former helps the 2D network branch improve its perception of spatial metrics, and the latter supplements structured texture information for the 3D network branch. In this way, modality-specific exclusive information can be effectively learned, and the complementarity of the two modalities is strengthened, yielding a network that is robust across domains. Under the supervision of these self-supervised tasks, we introduce a mixed domain that enhances perception of the target domain by mixing patches of source- and target-domain samples. In addition, we propose domain-category adversarial learning with category-wise discriminators, constructing category prototypes to learn domain-invariant features. We evaluate our method on various multi-modality domain adaptation settings, where it significantly outperforms both uni-modality and multi-modality state-of-the-art competitors.
Pages: 3338-3346
Number of Pages: 9