Cross-Domain 3D Model Retrieval Based On Contrastive Learning and Label Propagation

Cited by: 3

Authors:

Song, Dan ^{[1,2]}

Yang, Yue ^{[1]}

Nie, Weizhi ^{[1]}

Li, Xuanya ^{[3]}

Liu, An-An ^{[1,2]}

Affiliations:

[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin, Peoples R China

[2] Hefei Comprehensive Natl Sci Ctr, Inst Artificial Intelligence, Hefei, Peoples R China

[3] Baidu Inc, Beijing, Peoples R China

Source:

PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022 | 2022

Funding:

National Natural Science Foundation of China;

Keywords:

3D model retrieval; Domain adaptation; Contrastive learning;

DOI:

10.1145/3503161.3548044

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

In this work, we tackle unsupervised image-based 3D model retrieval, where the goal is to retrieve the unlabeled 3D models that are most visually similar to a 2D query image. Because of the challenging modality gap between 2D images and 3D models, existing mainstream methods adopt domain-adversarial techniques to reduce the gap, but these cannot guarantee the category-level alignment that is important for retrieval performance. Recent methods address category-level alignment by aligning the class centers of 2D images and 3D models. However, two main issues remain: 1) the category-level alignment is too coarse, and 2) the category prediction for unlabeled 3D models is not accurate. To overcome the first issue, we use contrastive learning for fine-grained category-level alignment across domains, which pulls prototypes and samples with the same semantic information closer together and pushes those with different semantic information apart. To provide reliable semantic predictions for contrastive learning, and thereby address the second issue, we propose a consistent decision for the pseudo labels of 3D models based on both the trained image classifier and label propagation. Experiments on the MI3DOR and MI3DOR-2 datasets demonstrate the effectiveness of the proposed method.
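The consistent decision described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact procedure, so everything below is an assumption: the function names `propagate_labels` and `consistent_pseudo_labels` are hypothetical, the feature vectors are toy data, and a simple k-nearest-neighbour vote stands in for the label propagation step rather than reproducing the authors' implementation. The idea shown is only that a pseudo label for a 3D model is kept when the image classifier and the propagated label agree.

```python
import numpy as np

def propagate_labels(src_feats, src_labels, tgt_feats, num_classes, k=1):
    """Simplified stand-in for label propagation (an assumption):
    each target sample takes the majority label of its k most
    cosine-similar labeled source samples."""
    src = src_feats / np.linalg.norm(src_feats, axis=1, keepdims=True)
    tgt = tgt_feats / np.linalg.norm(tgt_feats, axis=1, keepdims=True)
    sim = tgt @ src.T                          # (n_tgt, n_src) cosine similarities
    nn_idx = np.argsort(-sim, axis=1)[:, :k]   # indices of the k nearest sources
    votes = np.zeros((tgt.shape[0], num_classes))
    for c in range(num_classes):
        votes[:, c] = (src_labels[nn_idx] == c).sum(axis=1)
    return votes.argmax(axis=1)

def consistent_pseudo_labels(classifier_probs, propagated_labels):
    """Keep a pseudo label only where the classifier's prediction
    matches the propagated label."""
    cls_labels = classifier_probs.argmax(axis=1)
    keep = cls_labels == propagated_labels
    return cls_labels, keep

# Toy data (hypothetical): labeled 2D-image features and unlabeled 3D-model features.
image_feats = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
image_labels = np.array([0, 0, 1, 1])
model_feats = np.array([[1.0, 0.2], [0.2, 1.0], [0.6, 0.5]])
# Hypothetical softmax outputs of the image classifier applied to the 3D models.
classifier_probs = np.array([[0.9, 0.1], [0.2, 0.8], [0.3, 0.7]])

propagated = propagate_labels(image_feats, image_labels, model_feats, num_classes=2, k=1)
pseudo_labels, keep = consistent_pseudo_labels(classifier_probs, propagated)
print(propagated.tolist(), pseudo_labels.tolist(), keep.tolist())
```

In this toy run the third 3D model is discarded because the two sources of evidence disagree; only the retained samples would then feed a contrastive objective.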

Pages: 10
