Action Recognition Using Temporal Partitioning of Motion Information

Cited by: 0

Authors:

Amirjan, Pouria [1]

Mansouri, Azadeh [1]

Affiliations:

[1] Kharazmi Univ, Fac Elect & Comp Engn, Dept Engn, Tehran, Iran

Source:

2019 27th Iranian Conference on Electrical Engineering (ICEE 2019) | 2019

Keywords:

Action Recognition; First-person Video; Third-person Video; Sub-events; Pyramid Pooling

DOI:

10.1109/iraniancee.2019.8786379

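Chinese Library Classification (CLC):

TM [Electrical Engineering]; TN [Electronic and Communication Technology]

Discipline classification codes:

0808; 0809

Abstract: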

In this paper, a temporal representation method for video action recognition is proposed. Because temporal variation is an intrinsic property of a video stream, optical flow images are computed to capture short-term motion. To avoid training a complex network from scratch, a pre-trained network is used for frame-level feature extraction. For the video-level representation, a pyramidal pooled time series is used, since it captures short-term variation while producing fixed-size long-term features. In addition, to address the loss of information in long videos, a simple video-level representation based on temporal partitioning is also proposed. The experimental results show that the proposed method achieves acceptable performance in both first-person and third-person action recognition.
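The abstract does not give the exact pooling operators or pyramid depth, so the following is only a minimal sketch of the general idea of turning a variable-length sequence of frame-level features into one fixed-size video-level vector. The function name `temporal_pyramid_pool`, the pyramid levels `(1, 2, 4)`, and the use of max pooling are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def temporal_pyramid_pool(frame_features, levels=(1, 2, 4)):
    """Pool a (T, D) array of frame-level features into one fixed-size vector.

    Assumption: each pyramid level splits the time axis into equal partitions
    and max-pools inside each partition. The paper may use other operators.
    """
    frame_features = np.asarray(frame_features, dtype=np.float64)
    if frame_features.ndim != 2 or frame_features.shape[0] == 0:
        raise ValueError("expected a non-empty (T, D) feature array")

    num_frames = frame_features.shape[0]
    pooled = []
    for num_segments in levels:
        # Contiguous partition boundaries covering all frames.
        bounds = np.linspace(0, num_frames, num_segments + 1).astype(int)
        for start, end in zip(bounds[:-1], bounds[1:]):
            end = max(end, start + 1)  # avoid empty partitions in short videos
            pooled.append(frame_features[start:end].max(axis=0))

    # Output length is D * sum(levels), independent of the number of frames.
    return np.concatenate(pooled)


if __name__ == "__main__":
    short_video = np.random.rand(5, 16)    # 5 frames, 16-dim features
    long_video = np.random.rand(300, 16)   # 300 frames, same feature size
    print(temporal_pyramid_pool(short_video).shape)
    print(temporal_pyramid_pool(long_video).shape)
```

Under these assumptions, videos of different lengths map to vectors of the same length, which a fixed-input classifier can then consume; the finer levels keep some information about where in the video a motion pattern occurred.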

Citation

Pages: 1946-1950

Page count: 5
