Action Recognition Based on Efficient Deep Feature Learning in the Spatio-Temporal Domain

Cited by: 24

Authors:

Husain, Farzad ^{[1]}

Dellen, Babette ^{[2]}

Torras, Carme ^{[1]}

Affiliations:

[1] UPC, CSIC, Inst Robot & Informat Ind, Barcelona 08028, Spain

[2] Hsch Koblenz, RheinAhrCampus, D-53424 Remagen, Germany

Source:

IEEE ROBOTICS AND AUTOMATION LETTERS | 2016 / Vol. 1 / No. 2

Keywords:

Computer vision for automation; recognition; visual learning

DOI:

10.1109/LRA.2016.2529686

Chinese Library Classification (CLC) number:

TP24 [Robotics]

Discipline classification codes:

080202; 1405

Abstract:

Hand-crafted feature functions are usually designed based on the domain knowledge of a presumably controlled environment and often fail to generalize, as the statistics of real-world data cannot always be modeled correctly. Data-driven feature learning methods, on the other hand, have emerged as an alternative that often generalizes better in uncontrolled environments. We present a simple, yet robust, 2-D convolutional neural network extended to a concatenated 3-D network that learns to extract features from the spatio-temporal domain of raw video data. The resulting network model is used for content-based recognition of videos. Relying on a 2-D convolutional neural network allows us to exploit, as a descriptor, a pretrained network that yielded the best results on the large and challenging ILSVRC-2014 dataset. Experimental results on commonly used benchmarking video datasets demonstrate that our approach is state-of-the-art in terms of accuracy and computational time without requiring any preprocessing (e.g., optic flow) or a priori knowledge of data capture (e.g., camera motion estimation), which makes it more general and flexible than other approaches. Our implementation is made available.
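The general idea in the abstract, reusing 2-D filters on each frame and combining the per-frame responses along time, can be illustrated with a minimal sketch. This is not the authors' network: the function names, the hand-written kernels (standing in for pretrained filters), and the global average pooling are all assumptions made for illustration only.

```python
from typing import List

Frame = List[List[float]]   # one grayscale frame, H x W (assumed input format)
Kernel = List[List[float]]  # one 2-D filter; stands in for a pretrained filter


def conv2d_valid(frame: Frame, kernel: Kernel) -> Frame:
    """2-D 'valid' cross-correlation followed by ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(frame) - kh + 1
    out_w = len(frame[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            s = sum(frame[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            row.append(max(0.0, s))  # ReLU
        out.append(row)
    return out


def frame_descriptor(frame: Frame, kernels: List[Kernel]) -> List[float]:
    """One number per kernel: the global average of its response map."""
    desc = []
    for kernel in kernels:
        fmap = conv2d_valid(frame, kernel)
        n = len(fmap) * len(fmap[0])
        desc.append(sum(map(sum, fmap)) / n)
    return desc


def clip_descriptor(frames: List[Frame], kernels: List[Kernel]) -> List[float]:
    """Concatenate per-frame descriptors in temporal order.

    The same 2-D kernels are shared across all frames (hypothetical helper,
    not taken from the paper).
    """
    out: List[float] = []
    for frame in frames:
        out.extend(frame_descriptor(frame, kernels))
    return out


if __name__ == "__main__":
    frames = [[[1.0] * 3 for _ in range(3)], [[0.0] * 3 for _ in range(3)]]
    kernels = [[[1.0, 1.0], [1.0, 1.0]], [[1.0, -1.0], [1.0, -1.0]]]
    print(clip_descriptor(frames, kernels))  # [4.0, 0.0, 0.0, 0.0]
```

In this sketch the clip descriptor has length (number of frames) x (number of kernels), so a downstream classifier sees both what each filter responded to and when. A learned temporal stage, as the abstract's 3-D extension suggests, would replace the plain concatenation.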


Pages: 984-991

Page count: 8

