Multi-level channel attention excitation network for human action recognition in videos

Cited by: 4
Authors
Wu, Hanbo [1]
Ma, Xin [1]
Li, Yibin [1]
Affiliations
[1] Shandong Univ, Ctr Robot, Sch Control Sci & Engn, Jinan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Human action recognition; 2D CNNs; Channel attention; Spatiotemporal modeling;
DOI
10.1016/j.image.2023.116940
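Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification codes
0808; 0809;
Abstract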
The channel attention mechanism has attracted sustained interest and shown great potential for enhancing the performance of deep CNNs. However, when applied to video-based human action recognition, most existing methods learn channel attention at the frame level, which ignores temporal dependencies and may limit recognition performance. In this paper, we propose a novel multi-level channel attention excitation (MCAE) module that models temporal-related channel attention at both the frame and video levels. Specifically, based on video convolutional feature maps, frame-level channel attention (FCA) is generated by exploring time-channel correlations, and video-level channel attention (VCA) is generated by aggregating global motion variations. MCAE first recalibrates video feature responses with the frame-wise FCA, and then activates the motion-sensitive channel features with the motion-aware VCA. The MCAE module learns channel discriminability at multiple levels and can guide efficient spatiotemporal feature modeling in the activated motion-sensitive channels. It can be flexibly embedded into 2D networks at very limited extra computation cost to construct MCAE-Net, which effectively enhances the spatiotemporal representation of 2D models for video action recognition. Extensive experiments on five human action datasets show that our method achieves superior or very competitive performance compared with state-of-the-art methods, which demonstrates its effectiveness for human action recognition.
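The two-stage recalibration described in the abstract can be illustrated with a rough NumPy sketch. This is not the authors' implementation: the abstract does not give the layer design, so every concrete choice here is an assumption made for illustration, including the spatial average pooling, the fixed temporal smoothing kernel standing in for "time-channel correlations", the mean absolute frame difference standing in for "global motion variations", the bottleneck weights `w1` and `w2`, and all function names.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def frame_level_attention(x, kernel):
    """Hypothetical FCA: one weight per (frame, channel).

    x has shape (T, C, H, W); kernel is an odd-length 1D temporal filter.
    """
    t = x.shape[0]
    desc = x.mean(axis=(2, 3))                       # (T, C) spatial pooling
    pad = len(kernel) // 2
    padded = np.pad(desc, ((pad, pad), (0, 0)), mode="edge")
    # Mix each channel descriptor with its temporal neighbours (assumed design).
    mixed = sum(k * padded[i:i + t] for i, k in enumerate(kernel))
    return sigmoid(mixed)                            # (T, C), values in (0, 1)


def video_level_attention(x, w1, w2):
    """Hypothetical VCA: one weight per channel for the whole clip."""
    desc = x.mean(axis=(2, 3))                       # (T, C)
    # Aggregate adjacent-frame differences over time (assumed motion cue).
    motion = np.abs(np.diff(desc, axis=0)).mean(axis=0)   # (C,)
    hidden = np.maximum(motion @ w1, 0.0)            # ReLU bottleneck
    return sigmoid(hidden @ w2)                      # (C,), values in (0, 1)


def mcae_sketch(x, kernel, w1, w2):
    """Apply frame-wise recalibration first, then video-level activation."""
    fca = frame_level_attention(x, kernel)
    x = x * fca[:, :, None, None]
    vca = video_level_attention(x, w1, w2)
    return x * vca[None, :, None, None]


# Toy input: 8 frames, 16 channels, 7x7 feature maps (arbitrary sizes).
rng = np.random.default_rng(0)
T, C, H, W, r = 8, 16, 7, 7, 4
x = rng.standard_normal((T, C, H, W))
kernel = np.array([0.25, 0.5, 0.25])                 # placeholder, not learned
w1 = rng.standard_normal((C, C // r)) * 0.1          # placeholder weights
w2 = rng.standard_normal((C // r, C)) * 0.1
out = mcae_sketch(x, kernel, w1, w2)
print(out.shape)  # (8, 16, 7, 7)
```

Because both attention maps only rescale channels, the output keeps the input shape, which is consistent with the abstract's claim that the module can be embedded into existing 2D networks. In a trained model the kernel and bottleneck weights would be learned rather than fixed or random as they are here.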
Pages: 11
Related papers
50 in total
  • [21] Multi-Level Ensemble Network for Scene Recognition
    Zhang, Longhao
    Li, Lingqiao
    Pan, Xipeng
    Cao, Zhiwei
    Chen, Qianyu
    Yang, Huihua
    MULTIMEDIA TOOLS AND APPLICATIONS, 2019, 78 (19) : 28209 - 28230
  • [22] Multi-Level Ensemble Network for Scene Recognition
    Longhao Zhang
    Lingqiao Li
    Xipeng Pan
    Zhiwei Cao
    Qianyu Chen
    Huihua Yang
    Multimedia Tools and Applications, 2019, 78 : 28209 - 28230
  • [23] Learning multi-level features for sensor-based human action recognition
    Xu, Yan
    Shen, Zhengyang
    Zhang, Xin
    Gao, Yifan
    Deng, Shujian
    Wang, Yipei
    Fan, Yubo
    Chang, Eric I-Chao
    PERVASIVE AND MOBILE COMPUTING, 2017, 40 : 324 - 338
  • [24] Recurrent Spatial-Temporal Attention Network for Action Recognition in Videos
    Du, Wenbin
    Wang, Yali
    Qiao, Yu
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2018, 27 (03) : 1347 - 1360
  • [25] Multi-level Residual Attention Network for Speckle Suppression
    Lei, Yu
    Liu, Shuaiqi
    Zhang, Luyao
    Zhao, Ling
    Zhao, Jie
    PATTERN RECOGNITION AND COMPUTER VISION, PT IV, 2021, 13022 : 288 - 299
  • [26] Multi-Level Attention Network for Retinal Vessel Segmentation
    Yuan, Yuchen
    Zhang, Lei
    Wang, Lituan
    Huang, Haiying
    IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2022, 26 (01) : 312 - 323
  • [27] Multi-level attention network: Mixed time-frequency channel attention and multi-scale self-attentive standard deviation pooling for speaker recognition
    Deng, Lihong
    Deng, Fei
    Zhou, Kepeng
    Jiang, Peifan
    Zhang, Gexiang
    Yang, Qiang
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 128
  • [28] Multi-level Attention Fusion for Multimodal Driving Maneuver Recognition
    Liu, Jing
    Liu, Yang
    Tian, Chengwen
    Zhao, Mengyang
    Zeng, Xinhua
    Song, Liang
    2022 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS 22), 2022, : 2609 - 2613
  • [29] Multi-level Stereo Attention Model for Center Channel Extraction
    Lim, Wootaek
    Beack, Seungkwon
    Lee, Taejin
    2019 IEEE INTERNATIONAL SYMPOSIUM ON BROADBAND MULTIMEDIA SYSTEMS AND BROADCASTING (BMSB), 2019,
  • [30] MFDAN: Multi-Level Flow-Driven Attention Network for Micro-Expression Recognition
    Cai, Wenhao
    Zhao, Junli
    Yi, Ran
    Yu, Minjing
    Duan, Fuqing
    Pan, Zhenkuan
    Liu, Yong-Jin
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (12) : 12823 - 12836