Visual-Depth Matching Network: Deep RGB-D Domain Adaptation With Unequal Categories

Cited by: 4
Authors
Cai, Ziyun [1]
Jing, Xiao-Yuan [2]
Shao, Ling [3]
Affiliations
[1] Nanjing Univ Posts & Telecommun, Coll Automat, Nanjing 210023, Peoples R China
[2] Wuhan Univ, Sch Comp, Wuhan 430072, Peoples R China
[3] Inception Inst Artificial Intelligence, Dept Artificial Intelligence, Abu Dhabi, U Arab Emirates
Funding
U.S. National Science Foundation;
Keywords
Feature extraction; Task analysis; Image color analysis; Training; Image recognition; Neural networks; Deep learning; Domain adaptation (DA); image classification; RGB-D data; unequal categories; RECOGNITION; IMAGES; MODEL;
DOI
10.1109/TCYB.2020.3032194
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Existing domain adaptation (DA) methods generally assume that different domains share an identical label space and that the training data are sampled from a single domain. This assumption is restrictive for real-world applications, since it neglects the more practical scenario in which the source domain contains categories that are not shared by the target domain and the training data are collected from multiple modalities. In this article, we address a more difficult but practical problem: recognizing RGB images by training on RGB-D data under the label space inequality scenario. There are three challenges in this task: 1) the source and target domains suffer from domain mismatch, so the trained models perform poorly on the test data; 2) depth images are absent in the target domain (e.g., target images are captured by smartphones), whereas the source domain contains both RGB and depth data, which makes ordinary visual recognition approaches hard to apply to this task; and 3) in the real world, the source and target domains always have different numbers of categories, which makes the negative transfer bottleneck more prominent. To tackle these challenges, we formulate a deep model, called the visual-depth matching network (VDMN), in which two new modules and a matching component are trained jointly in an end-to-end fashion to identify the common and outlier categories effectively. The significance of VDMN is that it can take advantage of depth information and handle the domain distribution mismatch under label inequality simultaneously. The experimental results show that VDMN exceeds state-of-the-art performance on various DA datasets, especially under the label inequality scenario.
Pages: 4623-4635
Page count: 13