VIEW-INVARIANT ACTION RECOGNITION FROM RGB DATA VIA 3D POSE ESTIMATION

Cited by: 0
Authors
Baptista, Renato [1]
Ghorbel, Enjie [1]
Papadopoulos, Konstantinos [1]
Demisse, Girum G. [1]
Aouada, Djamila [1]
Ottersten, Bjorn [1]
Affiliations
[1] Univ Luxembourg, Interdisciplinary Ctr Secur Reliabil & Trust, 29 Ave JF Kennedy, L-1855 Luxembourg, Luxembourg
Funding
EU Horizon 2020
Keywords
Pose Estimation; Skeleton; View-Invariance; LSTM
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this paper, we propose a novel view-invariant action recognition method using a single monocular RGB camera. View invariance remains a very challenging topic in 2D action recognition due to the lack of 3D information in RGB images. Most successful approaches rely on knowledge transfer, projecting 3D synthetic data to multiple viewpoints. Instead of relying on knowledge transfer, we propose to augment the RGB data with a third dimension by estimating 3D skeletons from 2D images using a CNN-based pose estimator. To ensure view invariance, an alignment pre-processing step is applied, followed by data expansion as a means of denoising. Finally, a Long Short-Term Memory (LSTM) architecture is used to model the temporal dependency between skeletons. The proposed network is trained to directly recognize actions from aligned 3D skeletons. Experiments on the challenging Northwestern-UCLA dataset show that our approach outperforms state-of-the-art methods.
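The abstract does not say how the alignment pre-processing is carried out. As a minimal sketch of one common choice, and not necessarily the authors' procedure, a 3D skeleton can be centered on a root joint and rotated about the vertical axis so that the hip line points along the x-axis. The joint indices (`ROOT`, `LEFT_HIP`, `RIGHT_HIP`), the choice of y as the vertical axis, and the function names are assumptions made only for this illustration.

```python
import math

# Hypothetical joint indices; real skeleton layouts differ per pose estimator.
ROOT, LEFT_HIP, RIGHT_HIP = 0, 1, 2


def rotate_y(joint, angle):
    """Rotate one (x, y, z) joint about the y-axis (assumed vertical)."""
    x, y, z = joint
    c, s = math.cos(angle), math.sin(angle)
    return (x * c + z * s, y, -x * s + z * c)


def align_skeleton(joints):
    """Center the skeleton on the root joint, then rotate it about the
    y-axis so the left-to-right hip vector has no z component."""
    rx, ry, rz = joints[ROOT]
    centered = [(x - rx, y - ry, z - rz) for x, y, z in joints]
    lx, _, lz = centered[LEFT_HIP]
    hx, _, hz = centered[RIGHT_HIP]
    # Angle of the hip vector in the horizontal (x, z) plane.
    theta = math.atan2(hz - lz, hx - lx)
    return [rotate_y(j, theta) for j in centered]


# The same pose seen from a camera rotated about the vertical axis and
# shifted in space should align to the same canonical skeleton.
skeleton = [(0.0, 1.0, 0.0), (-0.2, 1.0, 0.1), (0.2, 1.0, -0.1), (0.0, 1.6, 0.05)]
other_view = [
    tuple(a + b for a, b in zip(rotate_y(j, 0.7), (1.0, -0.5, 2.0)))
    for j in skeleton
]
aligned_a = align_skeleton(skeleton)
aligned_b = align_skeleton(other_view)
```

Under these assumptions, `aligned_a` and `aligned_b` agree up to floating-point error, which is the property a sequence model such as an LSTM would rely on when trained on aligned skeletons. This sketch only removes rotation about the vertical axis and translation; it does not handle camera tilt or scale.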
Pages: 2542 - 2546
Number of pages: 5
Related papers
50 in total
  • [31] A View-invariant Framework for Fast Skeleton-based Action Recognition using a Single RGB Camera
    Ghorbel, Enjie
    Papadopoulos, Konstantinos
    Baptista, Renato
    Pathak, Himadri
    Demisse, Girum
    Aouada, Djamila
    Ottersten, Bjoern
    PROCEEDINGS OF THE 14TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS (VISAPP), VOL 5, 2019, : 573 - 582
  • [32] Pose-Invariant Face Recognition via RGB-D Images
    Sang, Gaoli
    Li, Jing
    Zhao, Qijun
    COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2016, 2016
  • [33] Dual-attention Network for View-invariant Action Recognition
    Gedamu Alemu Kumie
    Maregu Assefa Habtie
    Tewodros Alemu Ayall
    Changjun Zhou
    Huawen Liu
    Abegaz Mohammed Seid
    Aiman Erbad
    Complex & Intelligent Systems, 2024, 10 : 305 - 321
  • [34] HPERL: 3D Human Pose Estimation from RGB and LiDAR
    Fuerst, Michael
    Gupta, Shriya T. P.
    Schuster, Rene
    Wasenmueller, Oliver
    Stricker, Didier
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 7321 - 7327
  • [35] Simultaneous 3D Object Recognition and Pose Estimation Based on RGB-D Images
    Tsai, Chi-Yi
    Tsai, Shu-Hsiang
    IEEE ACCESS, 2018, 6 : 28859 - 28869
  • [36] 3D Hand Pose Estimation from RGB Using Privileged Learning with Depth Data
    Yuan, Shanxin
    Stenger, Bjorn
    Kim, Tae-Kyun
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019, : 2866 - 2873
  • [37] Dual-attention Network for View-invariant Action Recognition
    Kumie, Gedamu Alemu
    Habtie, Maregu Assefa
    Ayall, Tewodros Alemu
    Zhou, Changjun
    Liu, Huawen
    Seid, Abegaz Mohammed
    Erbad, Aiman
    COMPLEX & INTELLIGENT SYSTEMS, 2024, 10 (01) : 305 - 321
  • [38] Dual-view 3D human pose estimation without camera parameters for action recognition
    Liu, Long
    Yang, Le
    Chen, Wanjun
    Gao, Xin
    IET IMAGE PROCESSING, 2021, 15 (14) : 3433 - 3440
  • [39] View-invariant gesture recognition using 3D optical flow and harmonic motion context
    Holte, M. B.
    Moeslund, T. B.
    Fihl, P.
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2010, 114 (12) : 1353 - 1361
  • [40] View-invariant human action recognition via robust locally adaptive multi-view learning
    Jia-geng Feng
    Jun Xiao
    Frontiers of Information Technology & Electronic Engineering, 2015, 16 : 917 - 929