Collecting public RGB-D datasets for human daily activity recognition

Cited by: 7
Authors
Wu, Hanbo [1]
Ma, Xin [1]
Zhang, Zhimeng [1]
Wang, Haibo [1]
Li, Yibin [1]
Affiliation
[1] Shandong Univ, Sch Control Sci & Engn, 17923 Jingshi Rd, Jinan, Shandong, Peoples R China
Source
INTERNATIONAL JOURNAL OF ADVANCED ROBOTIC SYSTEMS | 2017 / Vol. 14 / Issue 04
Keywords
Human daily activity recognition; public RGB-D data sets merging; large-scale RGB-D activity data set; depth motion maps; depth cuboid similarity feature; curvature space scale; OBJECT RECOGNITION; FUSION; MODEL;
DOI
10.1177/1729881417709079
Chinese Library Classification
TP24 [Robotics];
Discipline classification code
080202 ; 1405 ;
Abstract
Human daily activity recognition has been an active research topic in computer vision for many decades. Despite considerable effort, activity recognition in natural, uncontrolled settings remains a challenging problem. Recently, RGB-D cameras, which perceive depth and visual cues simultaneously, have greatly improved the performance of activity recognition. However, owing to practical difficulties, the publicly available RGB-D data sets are not sufficiently large for benchmarking in terms of the diversity of their activities, subjects, and backgrounds. This severely limits the applicability of complex learning-based recognition approaches. To address the issue, this article provides a large-scale RGB-D activity data set by merging five public RGB-D data sets that differ from each other in many aspects, such as length of actions, nationality of subjects, and camera angles. The data set comprises 4528 samples depicting 7 action categories (up to 46 subcategories) performed by 74 subjects. To verify how challenging the data set is, three feature representation methods are evaluated: depth motion maps, the spatiotemporal depth cuboid similarity feature, and curvature space scale. Results show that the merged large-scale data set is more realistic and challenging and therefore more suitable for benchmarking.
Pages: 1 - 12
Page count: 12