Self-supervised Exclusive Learning for 3D Segmentation with Cross-modal Unsupervised Domain Adaptation

Cited by: 12
Authors
Zhang, Yachao [1 ]
Li, Miaoyu [1 ]
Xie, Yuan [2 ]
Li, Cuihua [1 ]
Wang, Cong [3 ]
Zhang, Zhizhong [2 ]
Qu, Yanyun [1 ]
Affiliations
[1] Xiamen Univ, Xiamen, Peoples R China
[2] East China Normal Univ, Shanghai, Peoples R China
[3] Huawei Technol, Shenzhen, Peoples R China
Source
PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022 | 2022
Funding
Shanghai Natural Science Foundation; National Natural Science Foundation of China;
Keywords
Cross-modality; Semantic segmentation; Unsupervised domain adaptation; Self-supervised exclusive learning; Mixed domain; SEMANTIC SEGMENTATION;
DOI
10.1145/3503161.3547987
CLC Number
TP39 [Computer applications];
Discipline Codes
081203 ; 0835 ;
Abstract
2D-3D unsupervised domain adaptation (UDA) tackles the lack of annotations in a new domain by capitalizing on the relationship between 2D and 3D data. Existing methods achieve considerable improvements by performing cross-modality alignment in a modality-agnostic way, but fail to exploit modality-specific characteristics for modeling complementarity. In this paper, we present self-supervised exclusive learning for cross-modal semantic segmentation under the UDA scenario, which avoids prohibitive annotation costs. Specifically, two self-supervised tasks are designed, named "plane-to-spatial" and "discrete-to-textured". The former helps the 2D network branch improve its perception of spatial metrics, and the latter supplements structured texture information for the 3D network branch. In this way, modality-specific exclusive information can be effectively learned, and the complementarity of the two modalities is strengthened, resulting in a network that is robust across domains. Supervised by these self-supervised tasks, we introduce a mixed domain that enhances perception of the target domain by mixing patches of source- and target-domain samples. In addition, we propose domain-category adversarial learning with category-wise discriminators, constructing category prototypes to learn domain-invariant features. We evaluate our method on various multi-modality domain adaptation settings, where our results significantly outperform both uni-modality and multi-modality state-of-the-art competitors.
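The mixed-domain idea described in the abstract, building training samples by swapping patches between source- and target-domain inputs, can be sketched as a CutMix-style grid mix. This is an illustrative approximation only; the paper's actual mixing scheme (patch size, sampling strategy, modality handling) is not specified here, and the function name and parameters below are hypothetical.

```python
import numpy as np

def mix_domain_patches(source_img, target_img, grid=4, mix_ratio=0.5, rng=None):
    """Sketch of mixed-domain sample construction (hypothetical helper).

    Splits both images into a `grid` x `grid` layout of patches and, for each
    cell, replaces the source patch with the target patch with probability
    `mix_ratio`. Returns the mixed image and the boolean swap mask.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = source_img.shape[:2]
    ph, pw = h // grid, w // grid
    mixed = source_img.copy()
    # True in the mask means "take this patch from the target domain".
    mask = rng.random((grid, grid)) < mix_ratio
    for i in range(grid):
        for j in range(grid):
            if mask[i, j]:
                mixed[i * ph:(i + 1) * ph, j * pw:(j + 1) * pw] = \
                    target_img[i * ph:(i + 1) * ph, j * pw:(j + 1) * pw]
    return mixed, mask
```

A segmentation network trained on such mixed samples sees target-domain appearance statistics inside otherwise labeled source-domain context, which is one common way mixing is used to improve target-domain perception.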
Pages: 3338-3346
Page count: 9