Learning Attention-Enhanced Spatiotemporal Representation for Action Recognition

Cited by: 11
Authors
Shi, Zhensheng [1]
Cao, Liangjie [1]
Guan, Cheng [1]
Zheng, Haiyong [1]
Gu, Zhaorui [1]
Yu, Zhibin [1]
Zheng, Bing [1]
Affiliations
[1] Ocean Univ China, Dept Elect Engn, Qingdao 266100, Peoples R China
Source
IEEE ACCESS | 2020, Vol. 8, Issue 08
Funding
National Natural Science Foundation of China;
Keywords
Action recognition; video understanding; spatiotemporal representation; visual attention; 3D-CNN; residual learning;
DOI
10.1109/ACCESS.2020.2968024
Chinese Library Classification
TP [Automation Technology; Computer Technology];
Discipline Code
0812;
Abstract
Learning spatiotemporal features via 3D-CNN (3D Convolutional Neural Network) models has been regarded as an effective approach for action recognition. In this paper, we explore the visual attention mechanism for video analysis and propose a novel 3D-CNN model, dubbed AE-I3D (Attention-Enhanced Inflated-3D Network), for learning attention-enhanced spatiotemporal representation. The contribution of our AE-I3D is threefold: First, we inflate soft attention to the spatiotemporal scope for 3D videos, and adopt softmax to generate a probability distribution over attentional features in a feedforward 3D-CNN architecture; Second, we devise an AE-Res (Attention-Enhanced Residual learning) module, which learns attention-enhanced features in a two-branch residual learning manner; moreover, the AE-Res module is lightweight and flexible, so that it can be easily embedded into many 3D-CNN architectures; Finally, we embed multiple AE-Res modules into an I3D (Inflated-3D) network, yielding our AE-I3D model, which can be trained in an end-to-end, video-level manner. Different from previous attention networks, our method inflates residual attention from 2D images to 3D videos for 3D attention residual learning to enhance spatiotemporal representation. We use RGB-only video data for evaluation on three benchmarks: UCF101, HMDB51, and Kinetics. The experimental results demonstrate that our AE-I3D is effective with competitive performance.
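The abstract's core idea (an attention branch whose softmax-normalized map reweights a feature branch, combined with the input through a residual shortcut) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the 1x1x1 channel-mixing weights `w_attn` and `w_feat`, the toy tensor sizes, and the choice to apply softmax over flattened spatiotemporal positions are all illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def ae_res_block(x, w_attn, w_feat):
    """Toy two-branch attention-enhanced residual block.

    x       : (C, T, H, W) spatiotemporal feature volume.
    w_attn  : (C, C) channel-mixing weights for the attention branch.
    w_feat  : (C, C) channel-mixing weights for the feature branch.

    The attention branch produces a probability distribution over
    the T*H*W spatiotemporal positions (per channel) via softmax;
    it reweights the feature branch, and the identity shortcut is
    added back, so input and output shapes match -- the property
    that lets such a module be dropped into an existing network.
    """
    C, T, H, W = x.shape
    flat = x.reshape(C, -1)                  # (C, T*H*W)
    attn = softmax(w_attn @ flat, axis=-1)   # probabilities over positions
    feat = w_feat @ flat                     # transformed features
    enhanced = attn * feat                   # attention-weighted features
    return x + enhanced.reshape(C, T, H, W)  # residual connection

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 2, 3, 3))        # C=4, T=2, H=W=3
w_attn = rng.standard_normal((4, 4))
w_feat = rng.standard_normal((4, 4))
y = ae_res_block(x, w_attn, w_feat)
```

In the actual AE-I3D model the branches are learned 3D convolutions inside an I3D backbone, but the shape-preserving residual structure sketched here is what makes the module embeddable at multiple depths of the network.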
Pages: 16785-16794 (10 pages)
Related Papers (50 total)
  • [11] HUMAN ACTION REPRESENTATION AND RECOGNITION: AN APPROACH TO A HISTOGRAM OF SPATIOTEMPORAL TEMPLATES
    Ahsan, Sk Md. Masudul
    Tan, Joo Kooi
    Kim, Hyoungseop
    Ishikawa, Seiji
    INTERNATIONAL JOURNAL OF INNOVATIVE COMPUTING INFORMATION AND CONTROL, 2015, 11 (06): : 1855 - 1867
  • [12] Better Deep Visual Attention with Reinforcement Learning in Action Recognition
    Wang, Gang
    Wang, Wenmin
    Wang, Jingzhuo
    Bu, Yaohua
    2017 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2017,
  • [13] Learning hierarchical video representation for action recognition
    Li Q.
    Qiu Z.
    Yao T.
    Mei T.
    Rui Y.
    Luo J.
    International Journal of Multimedia Information Retrieval, 2017, 6 (1) : 85 - 98
  • [14] Local motion feature extraction and spatiotemporal attention mechanism for action recognition
    Song, Xiaogang
    Zhang, Dongdong
    Liang, Li
    He, Min
    Hei, Xinhong
    VISUAL COMPUTER, 2024, 40 (11) : 7747 - 7759
  • [15] UNSUPERVISED MOTION REPRESENTATION ENHANCED NETWORK FOR ACTION RECOGNITION
    Yang, Xiaohang
    Kong, Lingtong
    Yang, Jie
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 2445 - 2449
  • [16] Learning Spatiotemporal-Selected Representations in Videos for Action Recognition
    Zhang, Jiachao
    Tong, Ying
    Jiao, Liangbao
    JOURNAL OF CIRCUITS SYSTEMS AND COMPUTERS, 2023, 32 (12)
  • [17] Action Recognition Using Visual Attention with Reinforcement Learning
    Li, Hongyang
    Chen, Jun
    Hu, Ruimin
    Yu, Mei
    Chen, Huafeng
    Xu, Zengmin
    MULTIMEDIA MODELING, MMM 2019, PT II, 2019, 11296 : 365 - 376
  • [18] Spatiotemporal Self-attention Modeling with Temporal Patch Shift for Action Recognition
    Xiang, Wangmeng
    Li, Chao
    Wang, Biao
    Wei, Xihan
    Hua, Xian-Sheng
    Zhang, Lei
    COMPUTER VISION - ECCV 2022, PT III, 2022, 13663 : 627 - 644
  • [19] Residual attention unit for action recognition
    Liao, Zhongke
    Hu, Haifeng
    Zhang, Junxuan
    Yin, Chang
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2019, 189
  • [20] An Improved Attention-Based Spatiotemporal-Stream Model for Action Recognition in Videos
    Liu, Dan
    Ji, Yunfeng
    Ye, Mao
    Gan, Yan
    Zhang, Jianwei
    IEEE ACCESS, 2020, 8 : 61462 - 61470