Revisiting Skeleton-based Action Recognition

Citations: 299
Authors
Duan, Haodong [1, 3]
Zhao, Yue [2]
Chen, Kai [3, 5]
Lin, Dahua [1, 3]
Dai, Bo [3, 4]
Affiliations
[1] Chinese Univ HongKong, Hong Kong, Peoples R China
[2] Univ Texas Austin, Austin, TX 78712 USA
[3] Shanghai AI Lab, Shanghai, Peoples R China
[4] Nanyang Technol Univ, S Lab, Singapore, Singapore
[5] SenseTime Res, Shenzhen, Peoples R China
Source
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022) | 2022
DOI
10.1109/CVPR52688.2022.00298
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The human skeleton, as a compact representation of human action, has received increasing attention in recent years. Many skeleton-based action recognition methods adopt GCNs to extract features on top of human skeletons. Despite the positive results shown in these attempts, GCN-based methods are subject to limitations in robustness, interoperability, and scalability. In this work, we propose PoseConv3D, a new approach to skeleton-based action recognition. PoseConv3D relies on a 3D heatmap volume instead of a graph sequence as the base representation of human skeletons. Compared to GCN-based methods, PoseConv3D is more effective in learning spatiotemporal features, more robust against pose estimation noise, and generalizes better in cross-dataset settings. PoseConv3D can also handle multiple-person scenarios without additional computation cost. Its hierarchical features can be easily integrated with other modalities at early fusion stages, providing a large design space for boosting performance. PoseConv3D achieves state-of-the-art results on five of six standard skeleton-based action recognition benchmarks. Once fused with other modalities, it achieves state-of-the-art results on all eight multi-modality action recognition benchmarks. Code has been made available at: https://github.com/kennymckormick/pyskl.
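To make the "3D heatmap volume" idea in the abstract concrete, the following is a minimal sketch of how 2D keypoints could be stacked into such a volume. The abstract does not describe the construction, so everything here is an assumption for illustration: the function name `keypoints_to_heatmap_volume`, the Gaussian-per-joint rendering, the `sigma` value, and the `(K, T, H, W)` layout are hypothetical and are not taken from the PYSKL code base.

```python
import numpy as np


def keypoints_to_heatmap_volume(keypoints, scores, height, width, sigma=0.6):
    """Stack per-joint Gaussian heatmaps over time (illustrative sketch).

    keypoints: array of shape (T, K, 2) holding (x, y) pixel coordinates.
    scores:    array of shape (T, K) holding keypoint confidences.
    Returns an array of shape (K, T, height, width).
    """
    keypoints = np.asarray(keypoints, dtype=np.float32)
    scores = np.asarray(scores, dtype=np.float32)
    num_frames, num_joints, _ = keypoints.shape

    # Pixel grids that broadcast to (height, width).
    ys = np.arange(height, dtype=np.float32)[:, None]
    xs = np.arange(width, dtype=np.float32)[None, :]

    volume = np.zeros((num_joints, num_frames, height, width), dtype=np.float32)
    for t in range(num_frames):
        for k in range(num_joints):
            x, y = keypoints[t, k]
            # Assumed rendering: a Gaussian centred on the joint,
            # scaled by the keypoint confidence.
            gaussian = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2 * sigma ** 2))
            volume[k, t] = gaussian * scores[t, k]
    return volume
```

A volume laid out this way has joints as channels and time as a depth axis, which is the kind of input a 3D convolutional network consumes; whether PoseConv3D uses exactly this layout is not stated in the abstract.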
Pages: 2959-2968
Page count: 10