Dual-Stage Uncertainty Modeling for Unsupervised Cross-Domain 3D Model Retrieval

Cited by: 1

Authors:

Li, Wenhui [1]

Zhou, Houran [1]

Zhang, Chenyu [1]

Nie, Weizhi [1]

Li, Xuanya [2]

Liu, An-An [1]

Affiliations:

[1] Tianjin Univ, Tianjin 300072, Peoples R China

[2] Baidu Inc, Beijing 100089, Peoples R China

Source:

IEEE TRANSACTIONS ON MULTIMEDIA | 2024 / Vol. 26

Funding:

National Natural Science Foundation of China;

Keywords:

Solid modeling; Uncertainty; Three-dimensional displays; Semantics; Prototypes; Gaussian distribution; Bicycles; Cross-domain learning; 3D model retrieval; uncertainty encoding; domain adaptation; ALIGNMENT;

DOI:

10.1109/TMM.2024.3384675

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

Unsupervised cross-domain 3D model retrieval aims to retrieve unlabeled 3D models (target domain) using labeled 2D images (source domain). Domain adaptation approaches have shown impressive performance for cross-domain 3D model retrieval. However, conventional methods typically represent samples from different domains as deterministic points, overlooking the diversity in sample characteristics and relationships. This makes it difficult to obtain robust representations of both samples and categories. To address these challenges, we propose a dual-stage uncertainty modeling (DSUM) method for unsupervised cross-domain 3D model retrieval, which uses Gaussian distributions to model uncertainty at both the sample and class levels and to obtain robust, domain-invariant representations. Specifically, in the multi-view uncertainty encoding stage, we discard the conventional pooling operations and use uncertainty modeling among multiple views to fuse the common and specific information of 2D images and 3D models. In the cross-domain feature alignment stage, we model samples belonging to the same category with a Gaussian distribution, which preserves sample diversity and helps eliminate the domain discrepancy. Our method achieves improvements of 2.61% and 2.65% in terms of FT on two cross-domain datasets, respectively, and extensive qualitative and quantitative experiments verify its superiority.
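
The abstract does not spell out the formulation, so the following is only an illustrative sketch of the general idea of replacing view pooling with a Gaussian over view features. It assumes a per-dimension (diagonal) Gaussian estimated from the view features of one object and a reparameterized sample as the fused embedding. The function name `gaussian_view_fusion`, the feature sizes, and the random inputs are hypothetical; this is not the authors' DSUM implementation.

```python
import numpy as np


def gaussian_view_fusion(view_feats, rng, eps=1e-6):
    """Hypothetical helper: fuse (V, D) view features with a diagonal Gaussian.

    Returns the per-dimension mean, the per-dimension variance, and one
    sample drawn as mean + std * noise (the reparameterization trick).
    """
    mu = view_feats.mean(axis=0)          # information shared across views
    var = view_feats.var(axis=0) + eps    # view-to-view spread (uncertainty)
    noise = rng.standard_normal(mu.shape)
    sample = mu + np.sqrt(var) * noise    # stochastic fused embedding
    return mu, var, sample


rng = np.random.default_rng(0)
views = rng.normal(size=(12, 8))          # assumed: 12 views, 8-dim features

mu, var, z = gaussian_view_fusion(views, rng)

# Deterministic max pooling, shown only for comparison: it keeps a single
# point per object and discards the spread across views.
max_pooled = views.max(axis=0)
```

The same mean-and-variance summary could in principle be computed over all samples of one category to obtain a class-level Gaussian, which is the kind of representation the alignment stage in the abstract refers to; how DSUM actually estimates and aligns these distributions is described in the paper itself.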

Pages: 8996-9007

Page count: 12
