Visual-Depth Matching Network: Deep RGB-D Domain Adaptation With Unequal Categories

Cited by: 4

Authors:

Cai, Ziyun [1]

Jing, Xiao-Yuan [2]

Shao, Ling [3]

Affiliations:

[1] Nanjing Univ Posts & Telecommun, Coll Automat, Nanjing 210023, Peoples R China

[2] Wuhan Univ, Sch Comp, Wuhan 430072, Peoples R China

[3] Inception Inst Artificial Intelligence, Dept Artificial Intelligence, Abu Dhabi, U Arab Emirates

Source:

IEEE TRANSACTIONS ON CYBERNETICS | 2022 / Vol. 52 / No. 06

Funding:

National Science Foundation (USA);

Keywords:

Feature extraction; Task analysis; Image color analysis; Training; Image recognition; Neural networks; Deep learning; Domain adaptation (DA); image classification; RGB-D data; unequal categories; RECOGNITION; IMAGES; MODEL;

DOI:

10.1109/TCYB.2020.3032194

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Existing domain adaptation (DA) methods generally assume that different domains share an identical label space and that the training data are sampled from a single domain. This assumption is too restrictive for real-world applications, since it neglects the more practical scenario in which the source domain contains categories that are not shared by the target domain and the training data are collected from multiple modalities. In this article, we address a more difficult but practical problem: recognizing RGB images by training on RGB-D data under the label space inequality scenario. There are three challenges in this task: 1) the source and target domains are affected by domain mismatch, so the trained models perform poorly on the test data; 2) depth images are absent in the target domain (e.g., target images are captured by smartphones), whereas the source domain contains both RGB and depth data, which makes ordinary visual recognition approaches hard to apply to this task; and 3) in the real world, the source and target domains always have different numbers of categories, which makes the negative transfer bottleneck more prominent. To tackle these challenges, we formulate a deep model, called the visual-depth matching network (VDMN), in which two new modules and a matching component are trained jointly in an end-to-end fashion to identify the common and outlier categories effectively. The significance of VDMN is that it can take advantage of depth information and handle the domain distribution mismatch under label inequality simultaneously. The experimental results show that VDMN exceeds state-of-the-art performance on various DA datasets, especially under the label inequality scenario.


Pages: 4623 - 4635

Number of pages: 13

24 items in total

[1] Deep RGB-D Saliency Detection Without Depth
Zhang, Yuan-fang
Zheng, Jiangbin
Jia, Wenjing
Huang, Wenfeng
Li, Long
Liu, Nian
Li, Fei
He, Xiangjian
IEEE TRANSACTIONS ON MULTIMEDIA, 2022, 24 : 755 - 767
[2] CDNet: Complementary Depth Network for RGB-D Salient Object Detection
Jin, Wen-Da
Xu, Jun
Han, Qi
Zhang, Yi
Cheng, Ming-Ming
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 3376 - 3390
[3] Learning to Weight Color and Depth for RGB-D Visual Search
Petrelli, Alioscia
Di Stefano, Luigi
IMAGE ANALYSIS AND PROCESSING (ICIAP 2017), PT I, 2017, 10484 : 648 - 659
[4] Enhancing Visual Odometry with Estimated Scene Depth: Leveraging RGB-D Data with Deep Learning
Kostusiak, Aleksander
Skrzypczynski, Piotr
ELECTRONICS, 2024, 13 (14)
[5] Domain embedding transfer for unequal RGB-D image recognition
Cai, Ziyun
Jing, Xiao-Yuan
Shao, Ling
PATTERN RECOGNITION, 2023, 143
[6] Unsupervised Domain Adaptation Learning Algorithm for RGB-D Stairway Recognition
Wang, Jing
Zhang, Kuangen
Instrumentation, 2019, 6 (02) : 21 - 29
[7] Adaptive Depth Enhancement Network for RGB-D Salient Object Detection
Yi, Kang
Li, Yumeng
Tang, Haoran
Xu, Jing
IEEE SIGNAL PROCESSING LETTERS, 2025, 32 : 176 - 180
[8] SiaTrans: Siamese transformer network for RGB-D salient object detection with depth image classification
Jia, XingZhao
DongYe, ChangLei
Peng, YanJun
IMAGE AND VISION COMPUTING, 2022, 127
[9] DDaNet: Dual-Path Depth-Aware Attention Network for Fingerspelling Recognition Using RGB-D Images
Yang, Shih-Hung
Chen, Wei-Ren
Huang, Wun-Jhu
Chen, Yon-Ping
IEEE ACCESS, 2021, 9 (09) : 7306 - 7322
[10] Multi-modal deep network for RGB-D segmentation of clothes
Joukovsky, B.
Hu, P.
Munteanu, A.
ELECTRONICS LETTERS, 2020, 56 (09) : 432 - 434
