Beyond coordinate attention: spatial-temporal recalibration and channel scaling for skeleton-based action recognition

Cited by: 2
Authors
Tang, Jun [1, 2]
Gong, Sihang [1 ]
Wang, Yanjiang [1 ]
Liu, Baodi [1 ]
Du, Chunyu [1 ]
Gu, Boyang [1 ]
Affiliations
[1] China Univ Petr East China, Coll Control Sci & Engn, Qingdao 266580, Peoples R China
[2] Qingdao Agr Univ, Coll Animat & Commun, Qingdao 266109, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Lightweight attention mechanism; Long-range dependency; Graph convolutional network; Skeleton-based action recognition; Object detection; Semantic segmentation;
DOI
10.1007/s11760-023-02747-0
Chinese Library Classification
TM [Electrical engineering]; TN [Electronic and communication technology];
Discipline classification codes
0808 ; 0809 ;
Abstract
Skeleton-based action recognition is an active research topic in computer vision. Recent lightweight attention mechanisms (e.g. coordinate attention) have proven to be highly effective in skeleton-based action recognition. However, because coordinate attention captures long-range dependencies along the spatial and temporal directions separately, it cannot capture accurate long-range dependencies over the entire spatio-temporal domain, which inevitably leads to inaccurate spatio-temporal localization. In this work, we propose an efficient and lightweight attention mechanism, called coordinate enhanced attention, which consists of spatial-temporal recalibration and channel scaling. Spatial-temporal recalibration aims to capture precise long-range dependencies directly in the entire spatio-temporal domain. Channel scaling is introduced to make efficient use of the multi-channel weight information. Coordinate enhanced attention is efficient and lightweight, and it can be easily integrated into classical neural networks. On two large-scale datasets for skeleton-based action recognition (i.e. NTU RGB+D 60 and NTU RGB+D 120), coordinate enhanced attention obtains consistent improvements. Experiments on two popular object detection datasets (i.e. COCO and Pascal VOC) and a semantic segmentation dataset (i.e. Cityscapes) indicate that the proposed coordinate enhanced attention outperforms other lightweight attention mechanisms, which further validates its transferability.
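The limitation described in the abstract can be illustrated with a small sketch. It is not the authors' coordinate enhanced attention, and it is not a full coordinate attention block either: the learned convolutions and normalization are omitted, and the names `directional_attention` and `is_separable` are hypothetical. Assuming a single-channel feature map indexed by frame and joint, pooling along each direction separately yields one gate per frame and one gate per joint, so the resulting attention map is always an outer product of two 1-D vectors and cannot weight an arbitrary (frame, joint) location independently.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def directional_attention(x):
    """Simplified direction-wise attention on one channel.

    x is a list of T rows (frames), each holding V values (joints).
    Learned layers are left out on purpose; this is an illustrative assumption.
    """
    T, V = len(x), len(x[0])
    # Pool over joints: one descriptor per frame (temporal direction).
    pooled_t = [sum(row) / V for row in x]
    # Pool over frames: one descriptor per joint (spatial direction).
    pooled_v = [sum(x[t][v] for t in range(T)) / T for v in range(V)]
    gate_t = [sigmoid(p) for p in pooled_t]
    gate_v = [sigmoid(p) for p in pooled_v]
    # The combined attention map is the outer product of the two gates.
    attn = [[gate_t[t] * gate_v[v] for v in range(V)] for t in range(T)]
    out = [[x[t][v] * attn[t][v] for v in range(V)] for t in range(T)]
    return attn, out


def is_separable(attn, tol=1e-12):
    """Return True if every 2x2 minor of a strictly positive map vanishes."""
    T, V = len(attn), len(attn[0])
    for t1 in range(T):
        for t2 in range(t1 + 1, T):
            for v1 in range(V):
                for v2 in range(v1 + 1, V):
                    minor = attn[t1][v1] * attn[t2][v2] - attn[t1][v2] * attn[t2][v1]
                    if abs(minor) > tol:
                        return False
    return True


if __name__ == "__main__":
    x = [[1.0, -2.0, 0.5], [0.0, 3.0, -1.0]]  # T = 2 frames, V = 3 joints
    attn, _ = directional_attention(x)
    print(is_separable(attn))
```

A map that highlights, say, joint 0 in frame 0 and joint 1 in frame 1 (and nothing else) is not separable in this sense, which is the kind of joint spatio-temporal pattern the abstract says direction-wise pooling misses.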
Pages: 199-206
Page count: 8
Related papers
50 in total
  • [1] Beyond coordinate attention: spatial-temporal recalibration and channel scaling for skeleton-based action recognition
    Tang, Jun
    Gong, Sihang
    Wang, Yanjiang
    Liu, Baodi
    Du, Chunyu
    Gu, Boyang
    Signal, Image and Video Processing, 2024, 18 : 199 - 206
  • [2] Spatial-temporal graph attention networks for skeleton-based action recognition
    Huang, Qingqing
    Zhou, Fengyu
    He, Jiakai
    Zhao, Yang
    Qin, Runze
    JOURNAL OF ELECTRONIC IMAGING, 2020, 29 (05)
  • [3] Spatial-Temporal gated graph attention network for skeleton-based action recognition
    Rahevar, Mrugendrasinh
    Ganatra, Amit
    PATTERN ANALYSIS AND APPLICATIONS, 2023, 26 (03) : 929 - 939
  • [4] Spatial-Temporal Dynamic Graph Attention Network for Skeleton-Based Action Recognition
    Rahevar, Mrugendrasinh
    Ganatra, Amit
    Saba, Tanzila
    Rehman, Amjad
    Bahaj, Saeed Ali
    IEEE ACCESS, 2023, 11 : 21546 - 21553
  • [5] Skeleton-based attention-aware spatial-temporal model for action detection and recognition
    Cui, Ran
    Zhu, Aichun
    Wu, Jingran
    Hua, Gang
    IET COMPUTER VISION, 2020, 14 (05) : 177 - 184
  • [6] TranSkeleton: Hierarchical Spatial-Temporal Transformer for Skeleton-Based Action Recognition
    Liu, Haowei
    Liu, Yongcheng
    Chen, Yuxin
    Yuan, Chunfeng
    Li, Bing
    Hu, Weiming
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (08) : 4137 - 4148
  • [7] A Spatial-Temporal Feature Fusion Strategy for Skeleton-Based Action Recognition
    Chen, Yitian
    Xu, Yuchen
    Xie, Qianglai
    Xiong, Lei
    Yao, Leiyue
    2023 INTERNATIONAL CONFERENCE ON DATA SECURITY AND PRIVACY PROTECTION, DSPP, 2023, : 207 - 215
  • [8] Learning Heterogeneous Spatial-Temporal Context for Skeleton-Based Action Recognition
    Gao, Xuehao
    Yang, Yang
    Wu, Yang
    Du, Shaoyi
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (09) : 12130 - 12141
  • [9] A Novel Spatial-Temporal Graph for Skeleton-based Driver Action Recognition
    Li, Peng
    Lu, Meiqi
    Zhang, Zhiwei
    Shan, Donghui
    Yang, Yang
    2019 IEEE INTELLIGENT TRANSPORTATION SYSTEMS CONFERENCE (ITSC), 2019, : 3243 - 3248
  • [10] Skeleton-based action recognition with local dynamic spatial-temporal aggregation
    Hu, Lianyu
    Liu, Shenglan
    Feng, Wei
    EXPERT SYSTEMS WITH APPLICATIONS, 2023, 232