A DEEP LEARNING-BASED APPROACH FOR CAMERA MOTION CLASSIFICATION

Cited by: 1

Authors:

Ouenniche, Kaouther [1]

Tapu, Ruxandra [1]

Zaharia, Titus [1]

Affiliations:

[1] Inst Polytech Paris, Lab SAMOVAR, Telecom SudParis, 9 Rue Charles Fourier, F-91011 Evry, France

Source:

PROCEEDINGS OF THE 2021 9TH EUROPEAN WORKSHOP ON VISUAL INFORMATION PROCESSING (EUVIP) | 2021

Keywords:

Camera motion classification; deep learning; ResNet; 3D CNN

DOI:

10.1109/EUVIP50544.2021.9483961

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

The automatic estimation of the various types of camera motion (e.g., traveling, panning, rolling, zoom) present in videos is an important challenge for automatic video indexing. Previous work is mainly based on optical flow estimation and analysis. In this paper, we propose a different, deep learning-based approach that classifies videos according to the type of camera motion. The proposed method is inspired by action recognition approaches and exploits 3D convolutional neural networks with residual blocks. Performance is objectively evaluated on challenging videos involving blurry frames, fast and slow motion, and poorly textured scenes. The accuracy rates obtained (with an average score of 94%) demonstrate the robustness of the proposed model.

Pages: 6
