MSANet: multimodal self-augmentation and adversarial network for RGB-D object recognition

Cited by: 12
Authors
Zhou, Feng [1 ]
Hu, Yong [2 ]
Shen, Xukun [1 ]
Affiliations
[1] Beihang Univ, State Key Lab Virtual Real Technol & Syst, Beijing, Peoples R China
[2] Beihang Univ, Sch New Media Art & Design, Beijing, Peoples R China
Source
VISUAL COMPUTER | 2019 / Vol. 35 / No. 11
Keywords
Deep learning; Object recognition; Adversarial network; Multimodal;
DOI
10.1007/s00371-018-1559-x
Chinese Library Classification (CLC) number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
This paper addresses the problem of object recognition using RGB-D data. Although deep convolutional neural networks have made progress in this area, they still suffer from the lack of large-scale manually labeled RGB-D data. Labeling a large-scale RGB-D dataset is time-consuming and tedious. More importantly, such large-scale datasets often have a long tail, and the hard positive examples in the tail can hardly be recognized. To solve these problems, we propose a multimodal self-augmentation and adversarial network (MSANet) for RGB-D object recognition, which augments the data effectively at two levels while preserving the annotations. At the first level, a series of transformations is used to generate class-agnostic examples for each instance, which supports the training of MSANet. At the second level, an adversarial network is proposed to generate class-specific hard positive examples while learning to classify them correctly, further improving the performance of MSANet. With these schemes, the proposed approach achieves the best results on several available RGB-D object recognition datasets; for example, our experimental results indicate a 1.5% accuracy boost on the benchmark Washington RGB-D object dataset compared with the current state of the art.
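The first augmentation level described in the abstract (transformations that keep the annotation) can be illustrated with a minimal sketch. The abstract does not say which transformations MSANet uses, so the random crop and horizontal flip below are assumptions, and the function name `augment_rgbd` and its parameters are hypothetical. The point it shows is that the same geometric transform is applied to the RGB image and the depth map so the two modalities stay aligned, while the label is passed through unchanged.

```python
import numpy as np

def augment_rgbd(rgb, depth, label, rng, crop=0.9):
    """Return one randomly transformed (rgb, depth, label) triple.

    Hypothetical helper: crop and flip are assumed transformations,
    not the ones reported in the paper.
    """
    h, w = depth.shape
    ch, cw = int(h * crop), int(w * crop)
    # Pick one crop window and use it for both modalities.
    top = rng.integers(0, h - ch + 1)
    left = rng.integers(0, w - cw + 1)
    rgb_out = rgb[top:top + ch, left:left + cw]
    depth_out = depth[top:top + ch, left:left + cw]
    # Flip both modalities together with probability 0.5.
    if rng.random() < 0.5:
        rgb_out = rgb_out[:, ::-1]
        depth_out = depth_out[:, ::-1]
    # The label is unchanged: the transform preserves the annotation.
    return rgb_out.copy(), depth_out.copy(), label

rng = np.random.default_rng(0)
depth = np.arange(10 * 20, dtype=np.float32).reshape(10, 20)
rgb = np.stack([depth, depth, depth], axis=-1)  # toy image aligned with depth
rgb_aug, depth_aug, label_aug = augment_rgbd(rgb, depth, "apple", rng, crop=0.5)
```

Calling the function several times per instance yields multiple training examples that share one label. The second level, adversarial generation of hard positives, requires a trained network and is not sketched here.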
Pages: 1583 / 1594
Page count: 12