MSANet: multimodal self-augmentation and adversarial network for RGB-D object recognition

Cited by: 12

Authors:

Zhou, Feng [1]

Hu, Yong [2]

Shen, Xukun [1]

Affiliations:

[1] Beihang Univ, State Key Lab Virtual Real Technol & Syst, Beijing, Peoples R China

[2] Beihang Univ, Sch New Media Art & Design, Beijing, Peoples R China

Source:

VISUAL COMPUTER | 2019 / Vol. 35 / Issue 11

Keywords:

Deep learning; Object recognition; Adversarial network; Multimodal;

DOI:

10.1007/s00371-018-1559-x

Chinese Library Classification number:

TP31 [Computer software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

This paper studies the problem of object recognition using RGB-D data. Although deep convolutional neural networks have made progress in this area, they still suffer greatly from the lack of large-scale manually labeled RGB-D data. Labeling a large-scale RGB-D dataset is a time-consuming and tedious task. More importantly, such large-scale datasets often have a long tail, and the hard positive examples in the tail can hardly be recognized. To solve these problems, we propose a multimodal self-augmentation and adversarial network (MSANet) for RGB-D object recognition, which augments the data effectively at two levels while preserving the annotations. At the first level, a series of transformations is leveraged to generate class-agnostic examples for each instance, which supports the training of MSANet. At the second level, an adversarial network is proposed to generate class-specific hard positive examples while learning to classify them correctly, further improving the performance of MSANet. With these schemes, the proposed approach achieves the best results on several available RGB-D object recognition datasets; for example, our experimental results indicate a 1.5% accuracy boost on the benchmark Washington RGB-D object dataset compared with the current state of the art.

Pages: 1583-1594

Page count: 12

50 items in total

[1] MSANet: multimodal self-augmentation and adversarial network for RGB-D object recognition
Feng Zhou
Yong Hu
Xukun Shen
The Visual Computer, 2019, 35 : 1583 - 1594
[2] A PCA-CCA network for RGB-D object recognition
Sun, Shiying
An, Ning
Zhao, Xiaoguang
Tan, Min
INTERNATIONAL JOURNAL OF ADVANCED ROBOTIC SYSTEMS, 2018, 15 (01):
[3] RGB-D OBJECT RECOGNITION WITH MULTIMODAL DEEP CONVOLUTIONAL NEURAL NETWORKS
Rahman, Mohammad Muntasir
Tan, Yanhao
Xue, Jian
Lu, Ke
2017 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2017, : 991 - 996
[4] Learning Coupled Classifiers with RGB images for RGB-D object recognition
Li, Xiao
Fang, Min
Zhang, Ju-Jie
Wu, Jinqiao
PATTERN RECOGNITION, 2017, 61 : 433 - 446
[5] Object Recognition and Augmentation for Wearable-Assistive System Using Egocentric RGB-D Sensor
Gao, Ge
Qian, Kun
Ma, Xudong
Xia, Jing
Yu, Hai
2017 IEEE 7TH ANNUAL INTERNATIONAL CONFERENCE ON CYBER TECHNOLOGY IN AUTOMATION, CONTROL, AND INTELLIGENT SYSTEMS (CYBER), 2017, : 775 - 780
[6] Dynamic Selective Network for RGB-D Salient Object Detection
Wen, Hongfa
Yan, Chenggang
Zhou, Xiaofei
Cong, Runmin
Sun, Yaoqi
Zheng, Bolun
Zhang, Jiyong
Bao, Yongjun
Ding, Guiguang
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 9179 - 9192
[7] Application of Transfer Learning in RGB-D Object Recognition
Kumar, Abhishek
Shrivatsav, S. Nithin
Subrahmanyam, G. R. K. S.
Mishra, Deepak
2016 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2016, : 580 - 584
[8] An Object Recognition Method using RGB-D Sensor
Maeda, Daisuke
Morimoto, Masakazu
2013 SECOND IAPR ASIAN CONFERENCE ON PATTERN RECOGNITION (ACPR 2013), 2013, : 857 - 861
[9] Object Class and Instance Recognition on RGB-D Data
Seib, Viktor
Christ-Friedmann, Susanne
Thierfelder, Susanne
Paulus, Dietrich
SIXTH INTERNATIONAL CONFERENCE ON MACHINE VISION (ICMV 2013), 2013, 9067
[10] Deep sensorimotor learning for RGB-D object recognition
Thermos, Spyridon
Papadopoulos, Georgios Th.
Daras, Petros
Potamianos, Gerasimos
COMPUTER VISION AND IMAGE UNDERSTANDING, 2020, 190