Temporal-Channel Topology Enhanced Network for Skeleton-Based Action Recognition

Cited by: 0
Authors
Luo, Jinzhao [1, 2]
Zhou, Lu [1, 2]
Zhu, Guibo [1, 2, 3]
Ge, Guojing [1]
Yang, Beiying [1, 2]
Wang, Jinqiao [1, 2, 3, 4]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Fdn Model Res Ctr, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
[3] Wuhan AI Res, Wuhan 430073, Peoples R China
[4] Peng Cheng Lab, Shenzhen 518066, Peoples R China
Source
PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT I | 2024 / Vol. 14425
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
Keywords
human skeleton; action recognition; topology modeling
DOI
10.1007/978-981-99-8429-9_9
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Skeleton-based action recognition has become popular in recent years due to its efficiency and robustness. Most current methods adopt graph convolutional networks (GCNs) for topology modeling, but GCN-based methods are limited in long-distance correlation modeling and generalizability. In contrast, the potential of convolutional neural networks (CNNs) for topology modeling has not been fully explored. In this paper, we propose a novel CNN architecture, Temporal-Channel Topology Enhanced Network (TCTE-Net), to learn spatial and temporal topologies for skeleton-based action recognition. TCTE-Net consists of two modules: the Temporal-Channel Focus module, which learns a temporal-channel focus matrix to identify the most important feature representations, and the Dynamic Channel Topology Attention module, which dynamically learns spatial topological features and fuses them with an attention mechanism to model long-distance channel-wise topology. We conduct experiments on the NTU RGB+D, NTU RGB+D 120, and FineGym datasets. TCTE-Net shows state-of-the-art performance compared to CNN-based methods and achieves superior performance compared to GCN-based methods. The code is available at https://github.com/aikuniverse/TCTE-Net.
Pages: 109-119
Page count: 11