Unsupervised 3D Pose Estimation for Hierarchical Dance Video Recognition

Cited by: 5
Authors
Hu, Xiaodan [1 ]
Ahuja, Narendra [1 ]
Affiliations
[1] Univ Illinois, Dept Elect & Comp Engn, Champaign, IL 61820 USA
Funding
US National Institute of Food and Agriculture;
DOI
10.1109/ICCV48922.2021.01083
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Dance experts often view dance as a hierarchy of information, spanning low-level (raw images, image sequences), mid-level (human poses and body part movements), and high-level (dance genre) representations. We propose a Hierarchical Dance Video Recognition framework (HDVR). HDVR estimates 2D pose sequences, tracks dancers, and then simultaneously estimates the corresponding 3D poses and 3D-to-2D imaging parameters, without requiring ground truth for 3D poses. Unlike most methods, which work on a single person, our tracking works on multiple dancers, including under occlusion. From the estimated 3D pose sequence, HDVR extracts body part movements and, from these, the dance genre. The resulting hierarchical dance representation is explainable to experts. To overcome noise and interframe correspondence ambiguities, we enforce spatial and temporal motion smoothness and photometric continuity over time. We use an LSTM network to extract 3D movement subsequences, from which we recognize the dance genre. For experiments, we have identified 154 movement types of 16 body parts and assembled a new University of Illinois Dance (UID) Dataset, containing 1143 video clips of 9 genres covering 30 hours, annotated with movement and genre labels. Our experimental results demonstrate that our algorithms outperform state-of-the-art 3D pose estimation methods, and that this improvement also enhances our dance recognition performance.
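As a rough illustration of the idea of estimating 3D poses and 3D-to-2D imaging parameters together without 3D ground truth, the sketch below combines a reprojection term with a temporal smoothness term. It is not the authors' implementation: the weak-perspective camera model, the function names, and the `smooth_weight` value are all assumptions made for this example, and the paper's spatial smoothness and photometric continuity terms are omitted.

```python
import numpy as np

def project_weak_perspective(pose_3d, scale, translation):
    """Assumed camera model: drop depth, then scale and shift in the image plane."""
    return scale * pose_3d[..., :2] + translation

def reprojection_loss(pose_3d_seq, pose_2d_seq, scale, translation):
    """Mean squared distance between projected 3D joints and detected 2D joints."""
    projected = project_weak_perspective(pose_3d_seq, scale, translation)
    return float(np.mean(np.sum((projected - pose_2d_seq) ** 2, axis=-1)))

def temporal_smoothness_loss(pose_3d_seq):
    """Mean squared frame-to-frame displacement of each 3D joint."""
    velocity = np.diff(pose_3d_seq, axis=0)  # shape (T-1, J, 3)
    return float(np.mean(np.sum(velocity ** 2, axis=-1)))

def total_loss(pose_3d_seq, pose_2d_seq, scale, translation, smooth_weight=0.1):
    """Hypothetical combined objective; smooth_weight is an illustrative value."""
    return (reprojection_loss(pose_3d_seq, pose_2d_seq, scale, translation)
            + smooth_weight * temporal_smoothness_loss(pose_3d_seq))

# Toy sequence: 3 frames, 2 joints, static pose.
pose_3d = np.zeros((3, 2, 3))
pose_3d[:, 1, :] = [1.0, 2.0, 5.0]
scale, translation = 2.0, np.array([10.0, 20.0])
pose_2d = project_weak_perspective(pose_3d, scale, translation)

# Perturb only the depth of one joint in the middle frame.
jittered = pose_3d.copy()
jittered[1, 1, 2] += 1.0

static_loss = total_loss(pose_3d, pose_2d, scale, translation)
jitter_reproj = reprojection_loss(jittered, pose_2d, scale, translation)
jitter_total = total_loss(jittered, pose_2d, scale, translation)
```

In this toy case the depth jitter leaves the reprojection term unchanged, because a weak-perspective projection discards depth, while the smoothness term penalizes it. That is one reason 2D supervision is usually paired with temporal priors when 3D ground truth is unavailable.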
Pages: 10995-11004
Page count: 10