A BAG-OF-IMPORTANCE MODEL FOR VIDEO SUMMARIZATION

Cited by: 0
Authors
Lu, Shiyang [1 ]
Wang, Zhiyong [1 ]
Song, Yuan [1 ]
Mei, Tao [2 ]
Feng, David Dagan [1 ]
Affiliations
[1] Univ Sydney, Sydney, NSW 2006, Australia
[2] Microsoft Res Asia, Beijing, Peoples R China
Source
ELECTRONIC PROCEEDINGS OF THE 2013 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO WORKSHOPS (ICMEW) | 2013
Keywords
Video summarization; sparse coding; framework; selection
DOI
Not available
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we propose a novel local-feature-based approach, the Bag-of-Importance (BoI) model, for static video summarization, whereas most existing approaches characterize each video frame with global features to derive its importance. Since local features such as interest points are more discriminative in characterizing visual content, we formulate static video summarization as the problem of identifying representative frames that contain more important local features, where the representativeness of each frame is the aggregation of the importance of the local features it contains. To derive the importance of each local feature for a given video, we employ sparse coding to project each local feature into a sparse space, calculate the l2 norm of the sparse coefficients for each local feature, and generate the BoI representation from the distribution of importance over all the local features in the video. To further take the perceptual difference among spatial regions of a frame into account, a spatial weighting template is used to differentiate the importance of local features within individual frames. The proposed video summarization scheme exploits both the inter-frame and intra-frame properties of local features, which allows the selected frames to capture both the dominant content and the discriminative details within a video. Experimental results on a dataset spanning several genres demonstrate that the proposed approach clearly outperforms the state-of-the-art method.
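The scoring idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the sparse coefficients have already been computed by some sparse coder, and the Gaussian centre-weighted template (`center_weight`), the `(frame_id, x, y, coeffs)` input layout, and the top-k selection rule are all hypothetical choices made for illustration, since the abstract does not specify them. The sketch also skips the BoI distribution step and simply sums weighted importances per frame.

```python
import math


def l2_norm(coeffs):
    """Importance of one local feature: l2 norm of its sparse coefficients."""
    return math.sqrt(sum(c * c for c in coeffs))


def center_weight(x, y, sigma=0.5):
    """Hypothetical spatial template (an assumption, not from the paper):
    Gaussian falloff from the frame centre, with x and y normalized to [0, 1]."""
    d2 = (x - 0.5) ** 2 + (y - 0.5) ** 2
    return math.exp(-d2 / (2 * sigma ** 2))


def frame_scores(features, weight_fn=center_weight):
    """Aggregate spatially weighted feature importance per frame.

    features: iterable of (frame_id, x, y, sparse_coeffs) tuples (assumed layout).
    """
    scores = {}
    for frame_id, x, y, coeffs in features:
        importance = l2_norm(coeffs)
        scores[frame_id] = scores.get(frame_id, 0.0) + weight_fn(x, y) * importance
    return scores


def select_frames(scores, k):
    """Pick the k highest-scoring frames (assumed rule), returned in temporal order."""
    ranked = sorted(scores, key=lambda f: (-scores[f], f))
    return sorted(ranked[:k])


# Toy input: precomputed sparse codes for four local features in three frames.
features = [
    (0, 0.5, 0.5, [3.0, 0.0, 4.0]),  # central feature, norm 5
    (0, 0.1, 0.9, [0.0, 1.0, 0.0]),  # peripheral feature, down-weighted
    (1, 0.5, 0.5, [0.0, 0.0, 1.0]),  # central feature, norm 1
    (2, 0.5, 0.5, [0.0, 2.0, 0.0]),  # central feature, norm 2
]
scores = frame_scores(features)
keyframes = select_frames(scores, k=2)
print(keyframes)
```

In this toy input, frame 0 scores highest because it holds the feature with the largest coefficient norm, and the peripheral feature contributes less than its raw norm because of the spatial weight.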
Pages: 6