A multimodal approach for extracting content descriptive metadata from lecture videos

Cited by: 14
Authors
Balasubramanian, Vidhya [1]
Doraisamy, Sooryanarayan Gobu [2]
Kanakarajan, Navaneeth Kumar [2]
Affiliations
[1] Amrita Vishwa Vidyapeetham Univ, Amrita Sch Engn, Dept Comp Sci & Engn, Coimbatore, Tamil Nadu, India
[2] Amrita Vishwa Vidyapeetham Univ, Amrita E Learning Res Ctr, Coimbatore, Tamil Nadu, India
Keywords
Multimodal metadata extraction; Content descriptive metadata; Keyphrase extraction; Topic based segmentation; Lecture videos
DOI
10.1007/s10844-015-0356-5
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The rapidly increasing availability of e-learning content and lecture videos over the internet has created a pressing need for effective content-based retrieval systems. Comprehensive metadata extraction and support for topic-level search within videos are key factors in developing such systems. In this paper, we propose a multimodal metadata extraction system that extracts an optimal set of keyphrases and topic-based segments that effectively summarize the content of a lecture video. The extraction process uses features from both the audio transcripts and the slide content in video streams. A hybrid approach combining a Naive Bayes classifier and a rule-based refiner is used to retrieve the metadata in a lecture. The proposed content-descriptive metadata extraction technique has been evaluated on actual lecture videos from different sources, and our results show that the multimodal approach is effective in summarizing a lecture's content, potentially improving the user experience during retrieval and browsing.
Pages: 121-145
Number of pages: 25