VIEW-INVARIANT ACTION RECOGNITION FROM RGB DATA VIA 3D POSE ESTIMATION

Cited by: 0
Authors
Baptista, Renato [1]
Ghorbel, Enjie [1]
Papadopoulos, Konstantinos [1]
Demisse, Girum G. [1]
Aouada, Djamila [1]
Ottersten, Bjorn [1]
Affiliations
[1] Univ Luxembourg, Interdisciplinary Ctr Secur Reliabil & Trust, 29 Ave JF Kennedy, L-1855 Luxembourg, Luxembourg
Funding
European Union Horizon 2020
Keywords
Pose Estimation; Skeleton; View-Invariance; LSTM
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this paper, we propose a novel view-invariant action recognition method using a single monocular RGB camera. View invariance remains a very challenging topic in 2D action recognition due to the lack of 3D information in RGB images. Most successful approaches make use of the concept of knowledge transfer by projecting 3D synthetic data to multiple viewpoints. Instead of relying on knowledge transfer, we propose to augment the RGB data by a third dimension by means of 3D skeleton estimation from 2D images using a CNN-based pose estimator. In order to ensure view invariance, a pre-processing for alignment is applied followed by data expansion as a way for denoising. Finally, a Long Short Term Memory (LSTM) architecture is used to model the temporal dependency between skeletons. The proposed network is trained to directly recognize actions from aligned 3D skeletons. The experiments performed on the challenging Northwestern-UCLA dataset show the superiority of our approach as compared to state-of-the-art ones.
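The alignment pre-processing mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not describe how alignment is done, so the body-centred frame below, the joint indices `LEFT_HIP`, `RIGHT_HIP`, and `SPINE`, and the function name `align_skeletons` are all assumptions chosen for illustration. The idea shown is only that expressing every estimated 3D skeleton in a frame attached to the body removes the dependence on camera position and orientation.

```python
import numpy as np

# Hypothetical joint indices; the real layout depends on the pose estimator used.
LEFT_HIP, RIGHT_HIP, SPINE = 0, 1, 2


def align_skeletons(seq):
    """Express a skeleton sequence of shape (T, J, 3) in a body-centred frame.

    The frame is built from the first skeleton only (an assumption):
    origin at the hip centre, x-axis along the hips, y-axis along the spine.
    """
    seq = np.asarray(seq, dtype=float)
    first = seq[0]

    # Origin: midpoint between the two hip joints.
    origin = 0.5 * (first[LEFT_HIP] + first[RIGHT_HIP])

    # x-axis: unit vector from left hip to right hip.
    x_axis = first[RIGHT_HIP] - first[LEFT_HIP]
    x_axis = x_axis / np.linalg.norm(x_axis)

    # y-axis: spine direction made orthogonal to the x-axis (Gram-Schmidt).
    up = first[SPINE] - origin
    up = up - up.dot(x_axis) * x_axis
    y_axis = up / np.linalg.norm(up)

    # z-axis completes the right-handed frame.
    z_axis = np.cross(x_axis, y_axis)

    # Rows of R are the body axes; projecting onto them changes coordinates.
    R = np.stack([x_axis, y_axis, z_axis])
    return (seq - origin) @ R.T
```

Under these assumptions, two recordings of the same motion that differ only by a rigid change of viewpoint (a rotation plus a translation) map to the same aligned sequence, which could then be fed to a temporal model such as the LSTM the abstract describes.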
Pages: 2542 - 2546
Number of pages: 5
Related papers
50 in total
  • [41] Temporal 3D Human Pose Estimation for Action Recognition from Arbitrary Viewpoints
    Musallam, Mohamed Adel
    Baptista, Renato
    Al Ismaeil, Kassem
    Aouada, Djamila
    2019 6TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND COMPUTATIONAL INTELLIGENCE (CSCI 2019), 2019, : 253 - 258
  • [42] Gesture Recognition Based on 3D Human Pose Estimation and Body Part Segmentation for RGB Data Input
    Nguyen, Ngoc-Hoang
    Phan, Tran-Dac-Thinh
    Lee, Guee-Sang
    Kim, Soo-Hyung
    Yang, Hyung-Jeong
    APPLIED SCIENCES-BASEL, 2020, 10 (18):
  • [43] Attention Transfer (ANT) Network for View-invariant Action Recognition
    Ji, Yanli
    Xu, Feixiang
    Yang, Yang
    Xie, Ning
    Shen, Heng Tao
    Harada, Tatsuya
    PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19), 2019, : 574 - 582
  • [44] Fully Automatic Pose-Invariant Face Recognition via 3D Pose Normalization
    Asthana, Akshay
    Marks, Tim K.
    Jones, Michael J.
    Tieu, Kinh H.
    Rohith, M. V.
    2011 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2011, : 937 - 944
  • [45] View-invariant human action recognition via robust locally adaptive multi-view learning
    Feng, Jia-geng
    Xiao, Jun
    FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING, 2015, 16 (11) : 917 - 929
  • [46] Deeply Learned View-Invariant Features for Cross-View Action Recognition
    Kong, Yu
    Ding, Zhengming
    Li, Jun
    Fu, Yun
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2017, 26 (06) : 3028 - 3037
  • [47] View-Invariant Action Recognition Based on Artificial Neural Networks
    Iosifidis, Alexandros
    Tefas, Anastasios
    Pitas, Ioannis
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2012, 23 (03) : 412 - 424
  • [48] View-Invariant Center-of-Pressure Metrics Estimation With Monocular RGB Camera
    Du, Chen
    Graham, Sarah
    Depp, Colin
    Nguyen, Truong
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 7388 - 7401
  • [49] Learning View-invariant Sparse Representations for Cross-view Action Recognition
    Zheng, Jingjing
    Jiang, Zhuolin
    2013 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2013, : 3176 - 3183
  • [50] Hierarchically Learned View-Invariant Representations for Cross-View Action Recognition
    Liu, Yang
    Lu, Zhaoyang
    Li, Jing
    Yang, Tao
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2019, 29 (08) : 2416 - 2430