Revisiting Skeleton-based Action Recognition

Cited by: 368

Authors:

Duan, Haodong [1,3]

Zhao, Yue [2]

Chen, Kai [3,5]

Lin, Dahua [1,3]

Dai, Bo [3,4]

Affiliations:

[1] Chinese Univ HongKong, Hong Kong, Peoples R China

[2] Univ Texas Austin, Austin, TX 78712 USA

[3] Shanghai AI Lab, Shanghai, Peoples R China

[4] Nanyang Technol Univ, S Lab, Singapore, Singapore

[5] SenseTime Res, Shenzhen, Peoples R China

Source:

2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) | 2022

DOI:

10.1109/CVPR52688.2022.00298

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The human skeleton, as a compact representation of human action, has received increasing attention in recent years. Many skeleton-based action recognition methods adopt GCNs to extract features on top of human skeletons. Despite the positive results shown in these attempts, GCN-based methods are subject to limitations in robustness, interoperability, and scalability. In this work, we propose PoseConv3D, a new approach to skeleton-based action recognition. PoseConv3D relies on a 3D heatmap volume instead of a graph sequence as the base representation of human skeletons. Compared to GCN-based methods, PoseConv3D is more effective in learning spatiotemporal features, more robust against pose estimation noise, and generalizes better in cross-dataset settings. Also, PoseConv3D can handle multiple-person scenarios without additional computation costs. The hierarchical features can be easily integrated with other modalities at early fusion stages, providing a large design space for boosting performance. PoseConv3D achieves state-of-the-art results on five of six standard skeleton-based action recognition benchmarks. Once fused with other modalities, it achieves state-of-the-art results on all eight multi-modality action recognition benchmarks. Code has been made available at: https://github.com/kennymckormick/pyskl.
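
To make the "3D heatmap volume" representation mentioned in the abstract more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that each 2D joint is rendered as a Gaussian blob scaled by its confidence score and that the per-frame heatmaps are stacked along the time axis. The function name `keypoints_to_heatmap_volume`, the `sigma` default, and the (K, T, H, W) output layout are all hypothetical choices made for this example.

```python
import numpy as np


def keypoints_to_heatmap_volume(keypoints, scores, height, width, sigma=0.6):
    """Render 2D keypoints as a stack of Gaussian heatmaps (illustrative only).

    keypoints: array of shape (T, K, 2) holding (x, y) pixel coordinates
               for K joints over T frames.
    scores:    array of shape (T, K) holding per-joint confidence scores.
    Returns an array of shape (K, T, height, width).
    """
    keypoints = np.asarray(keypoints, dtype=np.float32)
    scores = np.asarray(scores, dtype=np.float32)

    # Pixel grids, shaped so they broadcast against (T, K, 1, 1).
    ys = np.arange(height, dtype=np.float32)[:, None]  # (H, 1)
    xs = np.arange(width, dtype=np.float32)[None, :]   # (1, W)

    x0 = keypoints[..., 0][..., None, None]  # (T, K, 1, 1)
    y0 = keypoints[..., 1][..., None, None]  # (T, K, 1, 1)

    # Squared distance from every pixel to every joint: (T, K, H, W).
    dist2 = (xs - x0) ** 2 + (ys - y0) ** 2

    # Gaussian centered on each joint, scaled by its confidence
    # (an assumption of this sketch).
    heatmaps = np.exp(-dist2 / (2.0 * sigma ** 2)) * scores[..., None, None]

    # Move the joint axis first so joints act like channels for a 3D CNN.
    return heatmaps.transpose(1, 0, 2, 3)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, K, H, W = 8, 17, 56, 56
    kpts = rng.uniform(0, 56, size=(T, K, 2))
    conf = rng.uniform(0.5, 1.0, size=(T, K))
    volume = keypoints_to_heatmap_volume(kpts, conf, H, W)
    print(volume.shape)
```

A dense volume like this can be fed to a standard 3D convolutional network, in contrast to the graph-sequence input used by GCN-based methods.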

Pages: 2959-2968

Page count: 10