Attention-Based Temporal Weighted Convolutional Neural Network for Action Recognition

Cited by: 56
Authors
Zang, Jinliang [1 ]
Wang, Le [1 ]
Liu, Ziyi [1 ]
Zhang, Qilin [2 ]
Niu, Zhenxing
Hua, Gang [3 ]
Zheng, Nanning [1 ]
Affiliations
[1] Xi An Jiao Tong Univ, Xian 710049, Shaanxi, Peoples R China
[2] HERE Technol, Chicago, IL 60606 USA
[3] Microsoft Res, Redmond, WA 98052 USA
Source
ARTIFICIAL INTELLIGENCE APPLICATIONS AND INNOVATIONS, AIAI 2018 | 2018 / Vol. 519
Funding
China Postdoctoral Science Foundation;
Keywords
Action recognition; Attention model; Convolutional neural networks; Video-level prediction; Temporal weighting;
DOI
10.1007/978-3-319-92007-8_9
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Research in human action recognition has accelerated significantly since the introduction of powerful machine learning tools such as Convolutional Neural Networks (CNNs). However, effective and efficient methods for incorporating temporal information into CNNs are still being actively explored in the recent literature. Motivated by the popular recurrent attention models in natural language processing, we propose the Attention-based Temporal Weighted CNN (ATW), which embeds a visual attention model into a temporal weighted multi-stream CNN. This attention model is implemented simply as temporal weighting, yet it effectively boosts the recognition performance of video representations. In addition, each stream in the proposed ATW framework is capable of end-to-end training, with both network parameters and temporal weights optimized by stochastic gradient descent (SGD) with back-propagation. Our experiments show that the proposed attention mechanism contributes substantially to the performance gains by focusing on the more relevant, more discriminative video snippets.
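The temporal weighting idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the softmax normalization of the weights, and the toy scores are all assumptions made for illustration. It only shows how per-snippet class scores might be combined into a video-level prediction using one attention weight per snippet; in the paper these weights are learned jointly with the network by SGD.

```python
import math

def softmax(values):
    # Numerically stable softmax over a list of floats.
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def temporal_weighted_scores(snippet_scores, attention_logits):
    # snippet_scores: one list of class scores per snippet (hypothetical CNN outputs).
    # attention_logits: one unnormalized weight per snippet (assumed learnable).
    # Returns the normalized weights and the weighted sum of snippet scores.
    weights = softmax(attention_logits)
    num_classes = len(snippet_scores[0])
    video_scores = [0.0] * num_classes
    for w, scores in zip(weights, snippet_scores):
        for c in range(num_classes):
            video_scores[c] += w * scores[c]
    return weights, video_scores

# Toy example: three snippets, two classes, equal attention logits.
snippet_scores = [[2.0, 0.0], [0.0, 1.0], [4.0, 0.0]]
attention_logits = [0.0, 0.0, 0.0]
weights, video_scores = temporal_weighted_scores(snippet_scores, attention_logits)
```

With equal logits the result reduces to plain averaging over snippets; raising the logit of one snippet shifts the video-level score toward that snippet's prediction, which is the effect the abstract attributes to the attention mechanism.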
Pages: 97-108
Page count: 12