Universal unsupervised cross-domain 3D shape retrieval

Cited by: 0
Authors
Heyu Zhou
Fan Wang
Qipei Liu
Jiayu Li
Wen Liu
Xuanya Li
An-An Liu
Affiliations
[1] Yichang Testing Technique R&D Institute, Institute of Artificial Intelligence
[2] Hefei Comprehensive National Science Center, School of Electrical and Information Engineering
[3] Tianjin University, School of Navigation
[4] Baidu Inc.
[5] Tianjin Shengtong Technology Development Co., Ltd
[6] Wuhan University of Technology
Source
Multimedia Systems | 2024 / Vol. 30
Keywords
3D shape retrieval; Cross-domain retrieval; Multi-view representation learning; Multi-source domain adaptation;
DOI
Not available
Abstract
Most existing cross-domain 3D shape retrieval (CD3DSR) methods assume a fixed kind of query set (source domain), with all annotated query data following the same distribution. In practical scenarios, however, the labelled query sets are typically collected from multiple sources. In such scenarios, single-source CD3DSR methods may fail because of the domain shift across different sources, and universal CD3DSR methods are needed. In this paper, we propose a novel universal unsupervised domain adaptation network (U²DAN). It mainly consists of two modules: cross-domain statistics alignment (CDSA) and source-domain feature adaptation (SDFA). First, we use a 2D CNN to encode the query and the 3D shapes from the gallery (target domain) to obtain visual features. To mix up the features between each source–target domain pair, we introduce the margin disparity discrepancy (MDD) model to enforce domain alignment in an adversarial way. Since domain shifts also exist across different sources, which may degrade performance, we introduce two kinds of discriminators, a source-domain discriminator and a cycle cross-domain discriminator, to reduce source-domain bias. Further, since no 3D datasets were available for evaluating this setting, we constructed two novel datasets: MS3DOR-1 for universal cross-dataset 3D shape retrieval (3D-to-3D) and MS3DOR-2 for universal cross-modal 3D shape retrieval (2D-to-3D).
Extensive comparisons on the two datasets verify the effectiveness of U²DAN against state-of-the-art methods.
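The margin disparity discrepancy (MDD) alignment described above can be sketched as a loss over a main classifier and an auxiliary adversarial classifier: the auxiliary head is trained to agree with the main head's predictions on source features and to disagree on target features, and the feature extractor is updated against this discrepancy. The following is a minimal illustrative sketch in PyTorch, not the paper's actual implementation; the class name, margin value, and the exact sign convention of the transfer term are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MDDLoss(nn.Module):
    """Hypothetical sketch of an MDD-style adversarial alignment loss.

    Expects logits from a main classifier and an auxiliary ("adversarial")
    classifier on both source and target features.
    """

    def __init__(self, margin: float = 4.0):
        super().__init__()
        self.margin = margin  # the margin factor gamma in MDD

    def forward(self, logits_src, logits_src_adv,
                logits_tgt, logits_tgt_adv, labels_src):
        # Supervised loss on labelled source-domain data (main classifier).
        cls_loss = F.cross_entropy(logits_src, labels_src)

        # Auxiliary head should agree with the main head on source data.
        pred_src = logits_src.argmax(dim=1).detach()
        disc_src = F.cross_entropy(logits_src_adv, pred_src)

        # Auxiliary head should disagree with the main head on target data
        # (modified logistic term: -log(1 - p_agree)).
        pred_tgt = logits_tgt.argmax(dim=1).detach()
        prob_tgt_adv = F.softmax(logits_tgt_adv, dim=1)
        p_agree = prob_tgt_adv.gather(1, pred_tgt.unsqueeze(1)).squeeze(1)
        disc_tgt = -torch.log((1.0 - p_agree).clamp_min(1e-6)).mean()

        # Margin-weighted discrepancy; in practice the feature extractor
        # minimizes this while the auxiliary head maximizes it (e.g. via a
        # gradient-reversal layer).
        transfer_loss = self.margin * disc_src - disc_tgt
        return cls_loss, transfer_loss
```

In a full pipeline, one such discrepancy term would be computed per source–target pair, with the additional source-domain and cycle cross-domain discriminators from the paper reducing bias across the multiple sources.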