A DEEP LEARNING-BASED APPROACH FOR CAMERA MOTION CLASSIFICATION

Times cited: 2
Authors
Ouenniche, Kaouther [1]
Tapu, Ruxandra [1]
Zaharia, Titus [1]
Affiliations
[1] Inst Polytech Paris, Lab SAMOVAR, Telecom SudParis, 9 Rue Charles Fourier, F-91011 Evry, France
Source
PROCEEDINGS OF THE 2021 9TH EUROPEAN WORKSHOP ON VISUAL INFORMATION PROCESSING (EUVIP) | 2021
Keywords
Camera motion classification; deep learning; ResNet; 3D CNN
DOI
10.1109/EUVIP50544.2021.9483961
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The automatic estimation of the various types of camera motion (e.g., traveling, panning, rolling, zoom) present in videos is an important challenge for automatic video indexing. Previous research is mainly based on optical flow estimation and analysis. In this paper, we propose a different, deep learning-based approach that classifies videos according to the type of camera motion. The proposed method is inspired by action recognition approaches and exploits 3D convolutional neural networks with residual blocks. Performance is objectively evaluated on challenging videos involving blurry frames, fast/slow motion, and poorly textured scenes. The accuracy rates obtained (with an average score of 94%) demonstrate the robustness of the proposed model.
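As a rough illustration of the "3D convolution with residual blocks" idea named in the abstract, the following pure-Python sketch applies a single-channel residual block to a tiny video clip stored as nested lists (frames x rows x columns). It is not the authors' architecture: real networks use many channels, learned weights, and batch normalization, and all function names here (`conv3d_same`, `residual_block_3d`, `make_kernel`) are hypothetical helpers introduced only for this example.

```python
from itertools import product


def make_kernel(center, k=3):
    """Hypothetical helper: a k x k x k kernel that is zero except at its center."""
    kern = [[[0.0] * k for _ in range(k)] for _ in range(k)]
    kern[k // 2][k // 2][k // 2] = center
    return kern


def conv3d_same(clip, kernel):
    """Zero-padded 3D cross-correlation; output has the same T x H x W shape."""
    T, H, W = len(clip), len(clip[0]), len(clip[0][0])
    k = len(kernel)
    r = k // 2
    out = [[[0.0] * W for _ in range(H)] for _ in range(T)]
    for t, y, x in product(range(T), range(H), range(W)):
        acc = 0.0
        # Slide the kernel jointly over time (dt) and space (dy, dx).
        for dt, dy, dx in product(range(k), repeat=3):
            tt, yy, xx = t + dt - r, y + dy - r, x + dx - r
            if 0 <= tt < T and 0 <= yy < H and 0 <= xx < W:
                acc += kernel[dt][dy][dx] * clip[tt][yy][xx]
        out[t][y][x] = acc
    return out


def relu3d(volume):
    """Element-wise ReLU over a nested T x H x W list."""
    return [[[max(0.0, v) for v in row] for row in frame] for frame in volume]


def residual_block_3d(clip, k1, k2):
    """Compute relu(clip + conv(relu(conv(clip)))) with an identity shortcut."""
    h = relu3d(conv3d_same(clip, k1))
    h = conv3d_same(h, k2)
    summed = [
        [[a + b for a, b in zip(row_x, row_h)] for row_x, row_h in zip(fx, fh)]
        for fx, fh in zip(clip, h)
    ]
    return relu3d(summed)


# A toy 2-frame, 3 x 4 clip with non-negative intensities.
clip = [[[float(t + y + x) for x in range(4)] for y in range(3)] for t in range(2)]
zero = make_kernel(0.0)
ident = make_kernel(1.0)

# With zero kernels the convolution branch vanishes and the shortcut passes the clip through.
passthrough = residual_block_3d(clip, zero, zero)
# With identity kernels the branch reproduces the clip, so the sum doubles it.
doubled = residual_block_3d(clip, ident, ident)
```

The shortcut connection is what makes the block "residual": when the convolution branch contributes nothing, the input is passed through unchanged, which is the property that eases the training of deep networks.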
Pages: 6
Related papers
23 in total
  • [1] Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
    Carreira, Joao
    Zisserman, Andrew
    [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 4724 - 4733
  • [2] Cho K., 2014, P C EMP METH NAT LAN, P1724, DOI 10.3115/V1/D14-1179
  • [3] Long-Term Recurrent Convolutional Networks for Visual Recognition and Description
    Donahue, Jeff
    Hendricks, Lisa Anne
    Rohrbach, Marcus
    Venugopalan, Subhashini
    Guadarrama, Sergio
    Saenko, Kate
    Darrell, Trevor
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2017, 39 (04) : 677 - 691
  • [4] Learning Spatiotemporal Features with 3D Convolutional Networks
    Tran, Du
    Bourdev, Lubomir
    Fergus, Rob
    Torresani, Lorenzo
    Paluri, Manohar
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, : 4489 - 4497
  • [5] Convolutional Two-Stream Network Fusion for Video Action Recognition
    Feichtenhofer, Christoph
    Pinz, Axel
    Zisserman, Andrew
    [J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, : 1933 - 1941
  • [6] Gillespie W. J., 2004, TENCON 2004. 2004 IEEE Region 10 Conference (IEEE Cat. No. 04CH37582), P395
  • [7] Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?
    Hara, Kensho
    Kataoka, Hirokatsu
    Satoh, Yutaka
    [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 6546 - 6555
  • [8] Hara K, 2018, INT C PATT RECOG, P2516, DOI 10.1109/ICPR.2018.8546325
  • [9] CAMHID: Camera Motion Histogram Descriptor and Its Application to Cinematographic Shot Classification
    Hasan, Muhammad Abul
    Xu, Min
    He, Xiangjian
    Xu, Changsheng
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2014, 24 (10) : 1682 - 1695
  • [10] Deep Residual Learning for Image Recognition
    He, Kaiming
    Zhang, Xiangyu
    Ren, Shaoqing
    Sun, Jian
    [J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, : 770 - 778