Adaptive RGB Image Recognition by Visual-Depth Embedding

Cited by: 14
Authors
Cai, Ziyun [1]
Long, Yang [2]
Shao, Ling [3, 4]
Affiliations
[1] Nanjing Univ Posts & Telecommun, Coll Automat, Nanjing, Jiangsu, Peoples R China
[2] Newcastle Univ, Sch Comp, Open Lab, Newcastle Upon Tyne NE4 5TG, Tyne & Wear, England
[3] Incept Inst Artificial Intelligence, Abu Dhabi, U Arab Emirates
[4] Univ East Anglia, Sch Comp Sci, Norwich NR4 7TJ, Norfolk, England
Keywords
RGB-D data; domain adaptation; visual categorization; nonnegative matrix factorization; kernel
DOI
10.1109/TIP.2018.2806839
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recognizing RGB images from RGB-D data is a promising application that significantly reduces cost while still retaining high recognition rates. However, existing methods still suffer from the domain shift problem because conventional surveillance cameras and depth sensors use different mechanisms. In this paper, we aim to solve the following two challenges simultaneously: 1) how to take advantage of the additional depth information in the source domain, and 2) how to reduce the data distribution mismatch between the source and target domains. We propose a novel method called adaptive visual-depth embedding (aVDE), which first learns a compact shared latent space between the two representations of the labeled RGB and depth modalities in the source domain. The shared latent space then helps transfer the depth information to the unlabeled target dataset. Finally, aVDE models two separate learning strategies for domain adaptation (feature matching and instance reweighting) in a unified optimization problem, which matches features and reweights instances jointly across the shared latent space and the projected target domain to obtain an adaptive classifier. We test our method on five pairs of data sets for object recognition and scene classification, and the results demonstrate the effectiveness of the proposed method.
Pages: 2471-2483
Number of pages: 13