Cross-modal Unsupervised Domain Adaptation for 3D Semantic Segmentation via Bidirectional Fusion-then-Distillation

Cited by: 1
Authors
Wu, Yao [1]
Xing, Mingwei [2]
Zhang, Yachao [3]
Xie, Yuan [4, 5]
Fan, Jianping [6]
Shi, Zhongchao [6]
Qu, Yanyun [2]
Affiliations
[1] Xiamen Univ, Sch Informat, Xiamen, Peoples R China
[2] Xiamen Univ, Inst Artificial Intelligence, Xiamen, Peoples R China
[3] Tsinghua Univ, Shenzhen, Peoples R China
[4] East China Normal Univ, Shanghai, Peoples R China
[5] East China Normal Univ, Chongqing Inst, Chongqing, Peoples R China
[6] Lenovo Res, Beijing, Peoples R China
Source
PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023 | 2023
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation;
Keywords
3D semantic segmentation; Unsupervised domain adaptation;
DOI
10.1145/3581783.3612013
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Cross-modal Unsupervised Domain Adaptation (UDA) has become a research hotspot because it reduces the laborious annotation of target-domain samples. Existing methods only make the two modalities mutually mimic each other's outputs within each domain, which enforces agreement of the class probability distributions in different domains. However, these methods ignore the complementarity offered by fused modality representations in cross-modal learning. In this paper, we propose a cross-modal UDA method for 3D semantic segmentation via Bidirectional Fusion-then-Distillation, named BFtD-xMUDA, which explores cross-modal fusion in UDA and realizes distribution consistency between the outputs of the two domains, not only between the 2D image and the 3D point cloud but also between 2D/3D and the fusion. Our method contains three main components: the Model-agnostic Feature Fusion Module (MFFM), Bidirectional Distillation (B-Distill), and Cross-modal Debiased Pseudo-Labeling (xDPL). MFFM generates cross-modal fusion features to establish a latent space that enforces maximum correlation and complementarity between the two heterogeneous modalities. B-Distill exploits bidirectional knowledge distillation, which includes cross-modality distillation and cross-domain fusion distillation, to achieve domain-modality alignment. xDPL models the uncertainty of pseudo-labels through a self-training scheme. Extensive experimental results demonstrate that our method outperforms state-of-the-art competitors in several adaptation scenarios.
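The abstract does not give loss formulas, so the following is only a minimal sketch of what distillation between single-modality and fused predictions could look like for one point, assuming KL divergence between softmax outputs as the mimicry objective. The function names (`softmax`, `kl_divergence`, `bidirectional_distillation_loss`) and the equal weighting of the two directions are hypothetical choices for illustration, not the authors' implementation.

```python
import math


def softmax(logits):
    """Convert raw class scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def bidirectional_distillation_loss(logits_2d, logits_3d, logits_fused):
    """Hypothetical per-point loss: 2D/3D branches and the fused branch
    are pulled toward each other in both directions (assumed weighting 1:1)."""
    p_2d = softmax(logits_2d)
    p_3d = softmax(logits_3d)
    p_fused = softmax(logits_fused)
    # Fusion -> single modality: each branch mimics the fused prediction.
    fusion_to_single = kl_divergence(p_fused, p_2d) + kl_divergence(p_fused, p_3d)
    # Single modality -> fusion: the fused branch mimics each modality.
    single_to_fusion = kl_divergence(p_2d, p_fused) + kl_divergence(p_3d, p_fused)
    return fusion_to_single + single_to_fusion
```

Under these assumptions the loss is zero when all three branches agree and grows as their class distributions diverge; in a real training loop it would be averaged over points and combined with the supervised segmentation loss on the source domain.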
Pages: 490-498
Page count: 9
Related papers
44 in total
  • [1] Cross-Domain and Cross-Modal Knowledge Distillation in Domain Adaptation for 3D Semantic Segmentation
    Li, Miaoyu
    Zhang, Yachao
    Xie, Yuan
    Gao, Zuodong
    Li, Cuihua
    Zhang, Zhizhong
    Qu, Yanyun
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 3829 - 3837
  • [2] Self-supervised Exclusive Learning for 3D Segmentation with Cross-modal Unsupervised Domain Adaptation
    Zhang, Yachao
    Li, Miaoyu
    Xie, Yuan
    Li, Cuihua
    Wang, Cong
    Zhang, Zhizhong
    Qu, Yanyun
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 3338 - 3346
  • [3] Unsupervised domain adaptation for lip reading based on cross-modal knowledge distillation
    Takashima, Yuki
    Takashima, Ryoichi
    Tsunoda, Ryota
    Aihara, Ryo
    Takiguchi, Tetsuya
    Ariki, Yasuo
    Motoyama, Nobuaki
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2021, 2021 (01)
  • [4] Unsupervised domain adaptation for lip reading based on cross-modal knowledge distillation
    Yuki Takashima
    Ryoichi Takashima
    Ryota Tsunoda
    Ryo Aihara
    Tetsuya Takiguchi
    Yasuo Ariki
    Nobuaki Motoyama
    EURASIP Journal on Audio, Speech, and Music Processing, 2021
  • [5] Multi-modal unsupervised domain adaptation for semantic image segmentation
    Hu, Sijie
    Bonardi, Fabien
    Bouchafa, Samia
    Sidibe, Desire
    PATTERN RECOGNITION, 2023, 137
  • [6] Cross-Modal and Cross-Domain Knowledge Transfer for Label-Free 3D Segmentation
    Zhang, Jingyu
    Yang, Huitong
    Wu, Dai-Jie
    Keung, Jacky
    Li, Xuesong
    Zhu, Xinge
    Ma, Yuexin
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT III, 2024, 14427 : 465 - 477
  • [7] Unsupervised cross domain semantic segmentation with mutual refinement and information distillation
    Ren, Dexin
    Wang, Shidong
    Zhang, Zheng
    Yang, Wankou
    Ren, Mingwu
    Zhang, Haofeng
    NEUROCOMPUTING, 2024, 586
  • [8] BiFDANet: Unsupervised Bidirectional Domain Adaptation for Semantic Segmentation of Remote Sensing Images
    Cai, Yuxiang
    Yang, Yingchun
    Zheng, Qiyi
    Shen, Zhengwei
    Shang, Yongheng
    Yin, Jianwei
    Shi, Zhongtian
    REMOTE SENSING, 2022, 14 (01)
  • [9] Unsupervised domain adaptation via style adaptation and boundary enhancement for medical semantic segmentation
    Ge, Yisu
    Chen, Zhao-Min
    Zhang, Guodao
    Heidari, Ali Asghar
    Chen, Huiling
    Teng, Shu
    NEUROCOMPUTING, 2023, 550
  • [10] A Novel 3D Unsupervised Domain Adaptation Framework for Cross-Modality Medical Image Segmentation
    Yao, Kai
    Su, Zixian
    Huang, Kaizhu
    Yang, Xi
    Sun, Jie
    Hussain, Amir
    Coenen, Frans
    IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2022, 26 (10) : 4976 - 4986