Visual-Depth Matching Network: Deep RGB-D Domain Adaptation With Unequal Categories

Cited by: 4

Authors:

Cai, Ziyun [1]

Jing, Xiao-Yuan [2]

Shao, Ling [3]

Affiliations:

[1] Nanjing Univ Posts & Telecommun, Coll Automat, Nanjing 210023, Peoples R China

[2] Wuhan Univ, Sch Comp, Wuhan 430072, Peoples R China

[3] Inception Inst Artificial Intelligence, Dept Artificial Intelligence, Abu Dhabi, U Arab Emirates

Source:

IEEE TRANSACTIONS ON CYBERNETICS | 2022 / Vol. 52 / No. 06

Funding:

National Science Foundation (USA);

Keywords:

Feature extraction; Task analysis; Image color analysis; Training; Image recognition; Neural networks; Deep learning; Domain adaptation (DA); image classification; RGB-D data; unequal categories; RECOGNITION; IMAGES; MODEL;

DOI:

10.1109/TCYB.2020.3032194

Chinese Library Classification number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Existing domain adaptation (DA) methods generally assume that different domains share an identical label space and that the training data are sampled from a single domain. This assumption is restrictive for real-world applications, since it neglects the more practical scenario in which the source domain contains categories that are not shared by the target domain and the training data are collected from multiple modalities. In this article, we address a more difficult but practical problem: recognizing RGB images by training on RGB-D data under the label space inequality scenario. There are three challenges in this task: 1) the source and target domains are affected by domain mismatch, so the trained models perform imperfectly on the test data; 2) depth images are absent in the target domain (e.g., target images are captured by smartphones), whereas the source domain contains both RGB and depth data, which makes ordinary visual recognition approaches hard to apply to this task; and 3) in the real world, the source and target domains always have different numbers of categories, which makes the negative transfer bottleneck more prominent. To tackle these challenges, we formulate a deep model, called the visual-depth matching network (VDMN), in which two new modules and a matching component are trained jointly in an end-to-end fashion to identify the common and outlier categories effectively. The significance of VDMN is that it can simultaneously take advantage of depth information and handle the domain distribution mismatch under label inequality. The experimental results show that VDMN exceeds state-of-the-art performance on various DA datasets, especially under the label inequality scenario.

Pages: 4623-4635

Number of pages: 13
