Collecting public RGB-D datasets for human daily activity recognition

Cited by: 7

Authors:

Wu, Hanbo [1]

Ma, Xin [1]

Zhang, Zhimeng [1]

Wang, Haibo [1]

Li, Yibin [1]

Affiliations:

[1] Shandong Univ, Sch Control Sci & Engn, 17923 Jingshi Rd, Jinan, Shandong, Peoples R China

Source:

INTERNATIONAL JOURNAL OF ADVANCED ROBOTIC SYSTEMS | 2017, Vol. 14, No. 4

Keywords:

Human daily activity recognition; public RGB-D data sets merging; large-scale RGB-D activity data set; depth motion maps; depth cuboid similarity feature; curvature space scale; OBJECT RECOGNITION; FUSION; MODEL;

DOI:

10.1177/1729881417709079

Chinese Library Classification (CLC) number:

TP24 [Robotics]

Discipline classification codes:

080202; 1405

Abstract:

Human daily activity recognition has been a hot spot in the field of computer vision for many decades. Despite best efforts, activity recognition in naturally uncontrolled settings remains a challenging problem. Recently, by being able to perceive depth and visual cues simultaneously, RGB-D cameras greatly boost the performance of activity recognition. However, due to some practical difficulties, the publicly available RGB-D data sets are not sufficiently large for benchmarking when considering the diversity of their activities, subjects, and background. This severely affects the applicability of complicated learning-based recognition approaches. To address the issue, this article provides a large-scale RGB-D activity data set by merging five public RGB-D data sets that differ from each other in many aspects such as length of actions, nationality of subjects, or camera angles. This data set comprises 4528 samples depicting 7 action categories (up to 46 subcategories) performed by 74 subjects. To verify how challenging the data set is, three feature representation methods are evaluated: depth motion maps, spatiotemporal depth cuboid similarity feature, and curvature space scale. Results show that the merged large-scale data set is more realistic and challenging and therefore more suitable for benchmarking.


Page: 1 / 12

Number of pages: 12
