Video summarization via spatio-temporal deep architecture

Cited by: 30
Authors
Zhong, Sheng-hua [1 ,2 ]
Wu, Jiaxin [1 ,2 ]
Jiang, Jianmin [1 ,2 ]
Affiliations
[1] Shenzhen Univ, Natl Engn Lab Big Data Syst Comp Technol, Shenzhen, Peoples R China
[2] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Video summarization; Convolutional Neural Network (CNN); Class imbalance problem; SMOTE;
DOI
10.1016/j.neucom.2018.12.040
CLC number
TP18 [Theory of artificial intelligence];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
Video summarization is of unprecedented importance for overviewing the ever-growing volume of video collections. In this paper, we propose a novel dynamic video summarization model based on a deep learning architecture. We are the first to address the imbalanced class distribution problem in video summarization. An over-sampling algorithm is used to balance the class distribution of the training data, and a novel two-stream deep architecture with cost-sensitive learning is proposed to handle the class imbalance problem during feature learning. In the spatial stream, RGB images represent the appearance of video frames; in the temporal stream, multi-frame motion vectors are introduced, for the first time within a deep learning framework, to represent and extract temporal information from the input video. The proposed method is evaluated on two standard video summarization datasets and a standard emotional dataset. Empirical validations for video summarization demonstrate that our model improves on existing state-of-the-art methods. Moreover, the proposed method is able to highlight video content with a high level of arousal in the affective computing task. In addition, the proposed frame-based model has another advantage: it automatically preserves the connection between consecutive frames. Although the summary is constructed at the frame level, the final summary consists of informative and continuous segments rather than individual separate frames. (C) 2018 Elsevier B.V. All rights reserved.
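The abstract names three technical ingredients: over-sampling (SMOTE) to balance the key-frame / non-key-frame distribution, a two-stream network over RGB frames and multi-frame motion vectors, and a cost-sensitive loss. The paper's code is not part of this record, so the sketch below is only a minimal, illustrative PyTorch-style pipeline built under those assumptions; the class names, layer sizes, channel counts, and class-weighting scheme are hypothetical and are not the authors' implementation.

```python
# Hedged sketch (not the authors' code): SMOTE over-sampling of the minority
# key-frame class, a two-stream (RGB + motion-vector) frame classifier, and a
# cost-sensitive (class-weighted) cross-entropy loss, as outlined in the abstract.
import numpy as np
import torch
import torch.nn as nn
from imblearn.over_sampling import SMOTE  # assumption: SMOTE applied to pre-extracted features


def balance_features(features: np.ndarray, labels: np.ndarray):
    """Over-sample the minority (key-frame) class on pre-extracted frame features."""
    return SMOTE(random_state=0).fit_resample(features, labels)


class StreamCNN(nn.Module):
    """Small CNN shared by both streams; `in_channels` is 3 for RGB frames and
    2 * n_frames for stacked multi-frame motion vectors (illustrative sizes)."""
    def __init__(self, in_channels: int, feat_dim: int = 128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(64, feat_dim)

    def forward(self, x):
        return self.fc(self.conv(x).flatten(1))


class TwoStreamSummarizer(nn.Module):
    """Fuses the spatial (RGB) and temporal (motion-vector) streams and predicts
    whether a frame belongs to the summary (2 classes)."""
    def __init__(self, n_motion_channels: int = 10, feat_dim: int = 128):
        super().__init__()
        self.spatial = StreamCNN(in_channels=3, feat_dim=feat_dim)
        self.temporal = StreamCNN(in_channels=n_motion_channels, feat_dim=feat_dim)
        self.classifier = nn.Linear(2 * feat_dim, 2)

    def forward(self, rgb, motion):
        fused = torch.cat([self.spatial(rgb), self.temporal(motion)], dim=1)
        return self.classifier(fused)


def class_weighted_loss(labels: np.ndarray) -> nn.CrossEntropyLoss:
    """Cost-sensitive learning: weight the loss by inverse class frequency so the
    rare key-frame class is not dominated by background frames."""
    counts = np.bincount(labels, minlength=2).astype(np.float32)
    weights = counts.sum() / (2.0 * np.maximum(counts, 1.0))
    return nn.CrossEntropyLoss(weight=torch.tensor(weights))
```

A segment-level summary could then be formed by thresholding the per-frame scores and keeping runs of consecutive selected frames, which is consistent with the abstract's claim that the output consists of continuous segments rather than isolated frames.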
Pages: 224-235
Page count: 12
Related Papers (50 in total)
  • [1] Whiteboard Video Summarization via Spatio-Temporal Conflict Minimization
    Davila, Kenny
    Zanibbi, Richard
    2017 14TH IAPR INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), VOL 1, 2017, : 355 - 362
  • [2] Deep Video Matting via Spatio-Temporal Alignment and Aggregation
    Sun, Yanan
    Wang, Guanzhi
    Gu, Qiao
    Tang, Chi-Keung
    Tai, Yu-Wing
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 6971 - 6980
  • [3] CONTENT ADAPTIVE VIDEO SUMMARIZATION USING SPATIO-TEMPORAL FEATURES
    Nam, Hyunwoo
    Yoo, Chang D.
    2017 24TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2017, : 4003 - 4007
  • [4] Automatic video summarization driven by a spatio-temporal attention model
    Barland, R.
    Saadane, A.
    HUMAN VISION AND ELECTRONIC IMAGING XIII, 2008, 6806
  • [5] Deep video action clustering via spatio-temporal feature learning
    Peng, Bo
    Lei, Jianjun
    Fu, Huazhu
    Jia, Yalong
    Zhang, Zongqian
    Li, Yi
    NEUROCOMPUTING, 2021, 456 : 519 - 527
  • [6] Feature Pooling Using Spatio-Temporal Constrain for Video Summarization and Retrieval
    Ren, Jie
    Ren, Jinchang
    ADVANCED MULTIMEDIA AND UBIQUITOUS ENGINEERING: FUTURETECH & MUE, 2016, 393 : 381 - 387
  • [7] Spatio-temporal summarization of dance choreographies
    Rallis, Ioannis
    Doulamis, Nikolaos
    Doulamis, Anastasios
    Voulodimos, Athanasios
    Vescoukis, Vassilios
    COMPUTERS & GRAPHICS-UK, 2018, 73 : 88 - 101
  • [8] Global-local spatio-temporal graph convolutional networks for video summarization
    Wu, Guangli
    Song, Shanshan
    Zhang, Jing
    COMPUTERS & ELECTRICAL ENGINEERING, 2024, 118
  • [9] Video saliency prediction via spatio-temporal reasoning
    Chen, Jiazhong
    Li, Zongyi
    Jin, Yi
    Ren, Dakai
    Ling, Hefei
    NEUROCOMPUTING, 2021, 462 : 59 - 68
  • [10] System level design of a spatio-temporal video resampling architecture
    Kuo, Chih-Hung
    Chang, Li-Chuan
    Liu, Zheng-Wei
    Liu, Bin-Da
    PROCEEDINGS OF 2008 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOLS 1-10, 2008, : 2797 - 2800