VIEW-INVARIANT ACTION RECOGNITION FROM RGB DATA VIA 3D POSE ESTIMATION

Times cited: 0

Authors:

Baptista, Renato [1]

Ghorbel, Enjie [1]

Papadopoulos, Konstantinos [1]

Demisse, Girum G. [1]

Aouada, Djamila [1]

Ottersten, Bjorn [1]

Affiliations:

[1] Univ Luxembourg, Interdisciplinary Ctr Secur Reliabil & Trust, 29 Ave JF Kennedy, L-1855 Luxembourg, Luxembourg

Source:

2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2019

Funding:

European Union Horizon 2020;

Keywords:

Pose Estimation; Skeleton; View-Invariance; LSTM;

DOI:

Not available

Chinese Library Classification number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

In this paper, we propose a novel view-invariant action recognition method using a single monocular RGB camera. View invariance remains a very challenging topic in 2D action recognition due to the lack of 3D information in RGB images. Most successful approaches rely on knowledge transfer, projecting 3D synthetic data to multiple viewpoints. Instead of relying on knowledge transfer, we propose to augment the RGB data with a third dimension by estimating 3D skeletons from 2D images using a CNN-based pose estimator. To ensure view invariance, an alignment pre-processing step is applied, followed by data expansion as a means of denoising. Finally, a Long Short-Term Memory (LSTM) architecture is used to model the temporal dependency between skeletons. The proposed network is trained to directly recognize actions from aligned 3D skeletons. Experiments on the challenging Northwestern-UCLA dataset show that our approach outperforms state-of-the-art methods.
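The abstract does not say how the alignment pre-processing is carried out. As an illustration only, the sketch below shows one common way to make a 3D skeleton independent of the camera viewpoint: move a root joint to the origin, then rotate about the vertical axis so that the hip line points along the x-axis. The joint indices, the choice of y as the vertical axis, and the function names `align_skeleton` and `rotate_y` are all assumptions made for this example, not details taken from the paper.

```python
import math

def align_skeleton(joints, root=0, left_hip=1, right_hip=2):
    """Return a copy of `joints` (list of (x, y, z)) in a body-centred frame.

    Assumptions (hypothetical, not from the paper): y is the vertical axis,
    and `root`, `left_hip`, `right_hip` index the relevant joints.
    """
    # Step 1: translate so the root joint sits at the origin.
    rx, ry, rz = joints[root]
    centered = [(x - rx, y - ry, z - rz) for x, y, z in joints]

    # Step 2: measure the hip direction in the horizontal (x, z) plane.
    hx = centered[right_hip][0] - centered[left_hip][0]
    hz = centered[right_hip][2] - centered[left_hip][2]
    theta = math.atan2(hz, hx)

    # Step 3: rotate about the y-axis so the hip vector lies along +x.
    c, s = math.cos(theta), math.sin(theta)
    return [(x * c + z * s, y, -x * s + z * c) for x, y, z in centered]

def rotate_y(joints, angle, shift=(0.0, 0.0, 0.0)):
    """Simulate another viewpoint: rotate about y, then translate."""
    c, s = math.cos(angle), math.sin(angle)
    return [(x * c + z * s + shift[0], y + shift[1], -x * s + z * c + shift[2])
            for x, y, z in joints]

# Toy skeleton: root, left hip, right hip, head.
skeleton = [(0.0, 1.0, 0.0), (-0.2, 1.0, 0.0), (0.2, 1.0, 0.0), (0.0, 1.6, 0.1)]
other_view = rotate_y(skeleton, math.radians(40), shift=(1.0, 0.0, 2.5))

aligned_a = align_skeleton(skeleton)
aligned_b = align_skeleton(other_view)

# Largest coordinate difference between the two aligned skeletons.
max_diff = max(abs(p - q)
               for ja, jb in zip(aligned_a, aligned_b)
               for p, q in zip(ja, jb))
```

In this toy case the two views differ only by a rotation about the vertical axis and a translation, so both map to the same aligned skeleton up to floating-point error. A sequence of such aligned skeletons would then be the kind of input a temporal model such as an LSTM could consume.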


Pages: 2542 - 2546

Number of pages: 5
