Content Based Lecture Video Retrieval Using Speech and Video Text Information

Cited by: 103
Authors
Yang, Haojin [1]
Meinel, Christoph [1]
Affiliation
[1] Hasso Plattner Inst Software Syst Engn GmbH HPI, D-14440 Potsdam, Germany
Source
IEEE TRANSACTIONS ON LEARNING TECHNOLOGIES | 2014, Vol. 7, No. 2
Keywords
Lecture videos; automatic video indexing; content-based video search; lecture video archives;
DOI
10.1109/TLT.2014.2307305
Chinese Library Classification
TP39 [Computer applications];
Discipline classification codes
081203 ; 0835 ;
Abstract
Over the last decade, e-lecturing has become increasingly popular, and the amount of lecture video data on the World Wide Web (WWW) is growing rapidly. A more efficient method for video retrieval on the WWW or within large lecture video archives is therefore urgently needed. This paper presents an approach for automated video indexing and video search in large lecture video archives. First, we apply automatic video segmentation and key-frame detection to offer a visual guideline for video content navigation. Subsequently, we extract textual metadata by applying video Optical Character Recognition (OCR) technology to key-frames and Automatic Speech Recognition (ASR) to lecture audio tracks. The OCR and ASR transcripts, as well as detected slide text line types, are adopted for keyword extraction, by which both video- and segment-level keywords are extracted for content-based video browsing and search. The performance and effectiveness of the proposed indexing functionalities are demonstrated by evaluation.
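The keyword extraction step named in the abstract can be illustrated with a minimal sketch. The record does not describe the paper's actual scoring method, so everything below is an assumption made for illustration: the function names `tokenize`, `segment_keywords`, and `video_keywords`, the tiny stopword list, the smoothed TF-IDF weighting, and the `ocr_weight` factor that counts slide (OCR) text more heavily than spoken (ASR) text. Each segment is assumed to be a dictionary holding its OCR and ASR transcript.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on", "we", "this"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def _segment_scores(segments, ocr_weight):
    """Return one {term: score} dict per segment using smoothed TF-IDF."""
    counts = []
    for seg in segments:
        tf = Counter()
        for tok in tokenize(seg["ocr"]):
            tf[tok] += ocr_weight  # slide text counts more (assumed heuristic)
        for tok in tokenize(seg["asr"]):
            tf[tok] += 1.0
        counts.append(tf)
    n = len(counts)
    # Document frequency: in how many segments does each term occur?
    df = Counter(tok for tf in counts for tok in tf)
    return [
        {tok: f * (math.log((1 + n) / (1 + df[tok])) + 1.0) for tok, f in tf.items()}
        for tf in counts
    ]


def _top(scores, top_k):
    """Pick the top_k terms by score, breaking ties alphabetically."""
    return [tok for tok, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]]


def segment_keywords(segments, top_k=3, ocr_weight=2.0):
    """Segment-level keywords: the best-scoring terms of each segment."""
    return [_top(s, top_k) for s in _segment_scores(segments, ocr_weight)]


def video_keywords(segments, top_k=3, ocr_weight=2.0):
    """Video-level keywords: terms ranked by their summed segment scores."""
    total = Counter()
    for s in _segment_scores(segments, ocr_weight):
        total.update(s)
    return _top(total, top_k)


if __name__ == "__main__":
    demo = [
        {"ocr": "Optical Character Recognition",
         "asr": "we apply character recognition to the key frames"},
        {"ocr": "Automatic Speech Recognition",
         "asr": "speech recognition transcribes the audio track"},
    ]
    print(segment_keywords(demo))
    print(video_keywords(demo))
```

In this sketch a term that appears in every segment is down-weighted at segment level but can still rank highly at video level, which is one simple way to obtain different keyword sets for browsing within a video and for searching across videos.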
Pages: 142-154
Page count: 13