Visual-Depth Matching Network: Deep RGB-D Domain Adaptation With Unequal Categories

Cited by: 4
Authors
Cai, Ziyun [1]
Jing, Xiao-Yuan [2]
Shao, Ling [3]
Affiliations
[1] Nanjing Univ Posts & Telecommun, Coll Automat, Nanjing 210023, Peoples R China
[2] Wuhan Univ, Sch Comp, Wuhan 430072, Peoples R China
[3] Inception Inst Artificial Intelligence, Dept Artificial Intelligence, Abu Dhabi, U Arab Emirates
Funding
U.S. National Science Foundation
Keywords
Feature extraction; Task analysis; Image color analysis; Training; Image recognition; Neural networks; Deep learning; Domain adaptation (DA); image classification; RGB-D data; unequal categories; RECOGNITION; IMAGES; MODEL;
DOI
10.1109/TCYB.2020.3032194
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Existing domain adaptation (DA) methods generally assume that different domains share an identical label space and that the training data are sampled from a single domain. This assumption is too restrictive for real-world applications, since it neglects the more practical scenario in which the source domain contains categories that are not shared by the target domain and the training data are collected from multiple modalities. In this article, we address a more difficult but practical problem: recognizing RGB images by training on RGB-D data under the label space inequality scenario. This task poses three challenges: 1) the source and target domains suffer from domain mismatch, so the trained models perform imperfectly on the test data; 2) depth images are absent in the target domain (e.g., target images are captured by smartphones), whereas the source domain contains both RGB and depth data, which makes ordinary visual recognition approaches hard to apply to this task; and 3) in the real world, the source and target domains always have different numbers of categories, which makes the negative transfer bottleneck more prominent. To tackle these challenges, we formulate a deep model, called the visual-depth matching network (VDMN), in which two new modules and a matching component are trained jointly in an end-to-end fashion to identify the common and outlier categories effectively. The significance of VDMN is that it can exploit depth information and handle the domain distribution mismatch under label inequality simultaneously. The experimental results show that VDMN exceeds state-of-the-art performance on various DA datasets, especially under the label inequality scenario.
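The abstract does not say how VDMN separates common from outlier categories, so the following is only a minimal sketch of the negative-transfer issue named in challenge 3, not the VDMN method. It uses a generic class-reweighting heuristic from partial domain adaptation: average the classifier's predicted probabilities over target samples, treat classes with low average probability as likely outliers, and down-weight source samples from those classes in the training loss. The function names `estimate_class_weights` and `weighted_source_loss`, and the toy numbers, are hypothetical.

```python
def estimate_class_weights(target_probs):
    """Estimate one weight per source class from target-domain predictions.

    target_probs: list of per-sample probability vectors (one entry per
    source class). Classes that target samples rarely activate get a
    small weight and are treated as likely outlier (source-only) classes.
    """
    n_samples = len(target_probs)
    n_classes = len(target_probs[0])
    # Mean predicted probability of each class over all target samples.
    mean_prob = [
        sum(row[c] for row in target_probs) / n_samples
        for c in range(n_classes)
    ]
    # Scale so that the most likely shared class has weight 1.
    top = max(mean_prob)
    return [m / top for m in mean_prob]


def weighted_source_loss(per_sample_losses, source_labels, class_weights):
    """Weighted mean of source losses, using the weight of each sample's class."""
    weights = [class_weights[y] for y in source_labels]
    total = sum(w * loss for w, loss in zip(weights, per_sample_losses))
    return total / sum(weights)


# Toy setting (assumed): the source has 3 classes, the target only uses 0 and 1.
target_probs = [
    [0.90, 0.05, 0.05],
    [0.10, 0.85, 0.05],
    [0.80, 0.15, 0.05],
    [0.20, 0.75, 0.05],
]
class_weights = estimate_class_weights(target_probs)
loss = weighted_source_loss([1.0, 2.0, 3.0], [0, 1, 2], class_weights)
print(class_weights, loss)
```

In this toy setting class 2 receives a weight of about 0.1, so source samples from that class contribute little to the loss, which is the intuition behind suppressing outlier categories when the label spaces are unequal.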
Pages: 4623-4635
Page count: 13