Combine multi-order representation learning and frame optimization learning for skeleton-based action recognition

Cited by: 0
Authors
Nong, Liping [1, 3, 5]
Huang, Zhuocheng [2, 4]
Wang, Junyi [1]
Rong, Yanpeng [2, 4]
Peng, Jie [1]
Huang, Yiping [2, 4]
Affiliations
[1] Guilin Univ Elect Technol, Sch Informat & Commun, Guilin 541004, Peoples R China
[2] Guangxi Normal Univ, Sch Elect & Informat Engn, Guangxi Key Lab Brain inspired Comp & Intelligent, Guilin 541004, Peoples R China
[3] Guilin Univ Elect Technol, Key Lab Cognit Radio & Informat Proc, Minist Educ, Guilin 541004, Peoples R China
[4] Guangxi Normal Univ, Educ Dept Guangxi Zhuang Autonomous Reg, Key Lab Integrated Circuits & Microsyst, Guilin 541004, Peoples R China
[5] Guangxi Normal Univ, Coll Phys & Technol, Guilin 541004, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Skeleton-based action recognition; Graph convolutional network; Hypergraph convolutional network; Frame optimization learning
DOI
10.1016/j.dsp.2024.104823
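Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract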
Skeleton-based action recognition has broad application prospects in fields such as virtual reality. Currently, the most popular approach is to employ Graph Convolutional Networks (GCNs) or Hypergraph Convolutional Networks (HGCNs) for this task. However, GCN-based methods may rely heavily on the physical connectivity between joints while failing to capture higher-order information about interactions among distant joints, and HGCN-based methods usually introduce unnecessary noise when capturing low-order information from skeleton structures with simple topology. In addition, current methods do not handle redundant and confusing frames well. These limitations hinder improvements in recognition accuracy. In this paper, we propose a novel network, called Hyper-Net, which combines multi-order representation learning and frame optimization learning for skeleton-based action recognition. Specifically, the proposed Hyper-Net contains Temporal-Channel Aggregation Graph Convolution (TCA-GC), Spatial-Temporal Aggregation Hypergraph Convolution (STA-HC), and Frame Optimization Learning (F-OL) modules. The TCA-GC aggregates low-order, local information from simple joint and bone topologies across different temporal and channel dimensions. The STA-HC captures high-order, global information from complex motion streams and also addresses the problem of spatial-temporal weight imbalance. The F-OL adaptively extracts key frames and distinguishes confusing frames, thus improving the network's ability to recognize confusing actions. Extensive experiments are conducted on the NTU RGB+D, NTU RGB+D 120, and NW-UCLA datasets for the action recognition task. The experimental results demonstrate the superiority and effectiveness of the proposed network.
Pages: 12