Learning Rich Features from RGB-D Images for Object Detection and Segmentation

Cited by: 994

Authors:

Gupta, Saurabh [1]

Girshick, Ross [1]

Arbelaez, Pablo [2]

Malik, Jitendra [1]

Affiliations:

[1] Univ Calif Berkeley, Berkeley, CA 94720 USA

[2] Univ los Andes, Bogota, Colombia

Source:

COMPUTER VISION - ECCV 2014, PT VII | 2014 / Vol. 8695

Keywords:

RGB-D perception; object detection; object segmentation;

DOI:

10.1007/978-3-319-10584-0_23

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper we study the problem of object detection for RGB-D images using semantically rich image and depth features. We propose a new geocentric embedding for depth images that encodes height above ground and angle with gravity for each pixel in addition to the horizontal disparity. We demonstrate that this geocentric embedding works better than using raw depth images for learning feature representations with convolutional neural networks. Our final object detection system achieves an average precision of 37.3%, which is a 56% relative improvement over existing methods. We then focus on the task of instance segmentation where we label pixels belonging to object instances found by our detector. For this task, we propose a decision forest approach that classifies pixels in the detection window as foreground or background using a family of unary and binary tests that query shape and geocentric pose features. Finally, we use the output from our object detectors in an existing superpixel classification framework for semantic scene segmentation and achieve a 24% relative improvement over current state-of-the-art for the object categories that we study. We believe advances such as those represented in this paper will facilitate the use of perception in fields like robotics.
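The geocentric embedding described in the abstract can be illustrated with a minimal sketch. It assumes that 3D points, surface normals, and the gravity direction are already available in camera coordinates; the function name `geocentric_channels`, the `disparity_scale` parameter, the use of the lowest point as a stand-in for the ground, and measuring the angle against the upward direction are illustrative assumptions, not details taken from the paper.

```python
import math


def geocentric_channels(points, normals, gravity, disparity_scale=1.0):
    """Return (disparity, height, angle_deg) for each 3D point.

    points, normals: lists of (x, y, z) tuples in camera coordinates, z = depth.
    gravity: (x, y, z) direction of gravity in the same frame (assumed known).
    disparity_scale: hypothetical constant standing in for baseline * focal length.
    """
    g_norm = math.sqrt(sum(c * c for c in gravity))
    up = tuple(-c / g_norm for c in gravity)  # unit vector opposite to gravity

    # Signed elevation of every point along the "up" axis.
    elevations = [sum(pi * ui for pi, ui in zip(p, up)) for p in points]
    # Assumption: the lowest observed point approximates the ground level.
    ground = min(elevations)

    channels = []
    for p, n, elevation in zip(points, normals, elevations):
        disparity = disparity_scale / p[2]  # inversely proportional to depth
        height = elevation - ground         # height above the assumed ground
        n_norm = math.sqrt(sum(c * c for c in n))
        cos_angle = sum(ni * ui for ni, ui in zip(n, up)) / n_norm
        cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
        angle_deg = math.degrees(math.acos(cos_angle))
        channels.append((disparity, height, angle_deg))
    return channels
```

In a camera frame whose y axis points down, a floor point with an upward-facing normal gets height 0 and angle 0, while a point on a vertical wall gets an angle near 90 degrees. Such three-channel values could then be rescaled and fed to a convolutional network in place of raw depth.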

Pages: 345-360

Page count: 16

50 records in total

[21] Visual Saliency Detection for RGB-D Images with Generative Model
Wang, Song-Tao
Zhou, Zhen
Qu, Han-Bing
Li, Bin
COMPUTER VISION - ACCV 2016, PT V, 2017, 10115 : 20 - 35
[22] Object Detection-Based One-Shot Imitation Learning with an RGB-D Camera
Shao, Quanquan
Qi, Jin
Ma, Jin
Fang, Yi
Wang, Weiming
Hu, Jie
APPLIED SCIENCES-BASEL, 2020, 10 (03):
[23] Self-Supervised Pretraining With Multimodality Representation Enhancement for Salient Object Detection in RGB-D Images
Gao, Lina
Liu, Bing
Fu, Ping
Xu, Mingzhu
Zhang, Yonggang
Huang, Yulong
IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2025, 74
[24] Learning Implicit Class Knowledge for RGB-D Co-Salient Object Detection With Transformers
Zhang, Ni
Han, Junwei
Liu, Nian
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 4556 - 4570
[25] CDNet: Complementary Depth Network for RGB-D Salient Object Detection
Jin, Wen-Da
Xu, Jun
Han, Qi
Zhang, Yi
Cheng, Ming-Ming
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 3376 - 3390
[26] RGB-D Point Cloud Registration Based on Salient Object Detection
Wan, Teng
Du, Shaoyi
Cui, Wenting
Yao, Runzhao
Ge, Yuyan
Li, Ce
Gao, Yue
Zheng, Nanning
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (08) : 3547 - 3559
[27] Hierarchical Alternate Interaction Network for RGB-D Salient Object Detection
Li, Gongyang
Liu, Zhi
Chen, Minyu
Bai, Zhen
Lin, Weisi
Ling, Haibin
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 3528 - 3542
[28] Adaptive Depth Enhancement Network for RGB-D Salient Object Detection
Yi, Kang
Li, Yumeng
Tang, Haoran
Xu, Jing
IEEE SIGNAL PROCESSING LETTERS, 2025, 32 : 176 - 180
[29] Feature Calibrating and Fusing Network for RGB-D Salient Object Detection
Zhang, Qiang
Qin, Qi
Yang, Yang
Jiao, Qiang
Han, Jungong
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (03) : 1493 - 1507
[30] Semantic parsing for priming object detection in indoors RGB-D scenes
Cadena, Cesar
Kosecka, Jana
INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH, 2015, 34 (4-5) : 582 - 597
