Dynamic Language Model Adaptation Using Presentation Slides for Lecture Speech Recognition

Authors
Yamazaki, Hiroki [1]
Iwano, Koji [1]
Shinoda, Koichi [1]
Furui, Sadaoki [1]
Yokota, Haruo [1]
Affiliation
[1] Tokyo Inst Technol, Dept Comp Sci, Tokyo, Japan
Source
INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4 | 2007
Keywords
language model adaptation; speech recognition; classroom lecture speech
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We propose a dynamic language model adaptation method for lecture speech recognition that uses temporal information from lecture slides. The method consists of two steps. First, the language model is adapted with the text extracted from all the slides of a given lecture. Next, the text of each individual slide is extracted based on temporal information and used for local adaptation. Hence, the language model used to recognize the speech associated with a given slide changes dynamically from one slide to the next. We evaluated the proposed method on speech data from four Japanese lecture courses. The experiments show that the method is effective, especially for keyword detection: the F-measure error rate for lecture keywords was reduced by 2.4%.
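The two-step adaptation described in the abstract can be illustrated with a minimal sketch. This record does not state how the adaptation is performed, so the sketch assumes simple linear interpolation of unigram models; the function names, the interpolation weights, and the use of slide start times as the "temporal information" are all hypothetical choices made for illustration, not the authors' implementation.

```python
import bisect
from collections import Counter


def unigram(tokens):
    """Maximum-likelihood unigram model from a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def interpolate(base, adapt, weight):
    """Linear interpolation: (1 - weight) * base + weight * adapt (assumed scheme)."""
    vocab = set(base) | set(adapt)
    return {w: (1 - weight) * base.get(w, 0.0) + weight * adapt.get(w, 0.0)
            for w in vocab}


def adapt_for_lecture(base_lm, slides, global_weight=0.3):
    """Step 1 (global): adapt with the text of all slides in the lecture."""
    all_tokens = [t for slide in slides for t in slide]
    if not all_tokens:
        return dict(base_lm)
    return interpolate(base_lm, unigram(all_tokens), global_weight)


def adapt_for_slide(lecture_lm, slide_tokens, local_weight=0.3):
    """Step 2 (local): adapt further with the text of the current slide."""
    if not slide_tokens:
        return dict(lecture_lm)
    return interpolate(lecture_lm, unigram(slide_tokens), local_weight)


def slide_at(time_sec, slide_start_times):
    """Index of the slide on screen at time_sec, given sorted start times."""
    return max(0, bisect.bisect_right(slide_start_times, time_sec) - 1)


# Hypothetical toy data: a two-word base model and two tokenized slides.
base_lm = {"the": 0.5, "a": 0.5}
slides = [["hmm", "model"], ["language", "model"]]
starts = [0.0, 10.0]

lecture_lm = adapt_for_lecture(base_lm, slides, global_weight=0.5)
current = slide_at(12.0, starts)
slide_lm = adapt_for_slide(lecture_lm, slides[current], local_weight=0.5)
```

In this sketch the model used by the recognizer is rebuilt whenever `slide_at` returns a new index, which is one way to obtain a language model that changes from one slide to the next. A real system would typically adapt higher-order n-gram models with smoothing rather than raw unigrams.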
Pages: 89-92
Page count: 4