LANGUAGE MODEL ADAPTATION FOR ACADEMIC LECTURES USING CHARACTER RECOGNITION RESULT OF PRESENTATION SLIDES

Cited by: 0
Authors
Akita, Yuya [1 ]
Tong, Yizheng [1 ]
Kawahara, Tatsuya [1 ]
Affiliations
[1] Kyoto Univ, Sch Informat, Kyoto 6068501, Japan
Source
2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP) | 2015
Keywords
Language model; adaptation; lectures; character recognition; presentation slides; SPEECH RECOGNITION; TRANSCRIPTION; INFORMATION;
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
For automatic speech recognition (ASR) of lectures, the text of presentation slides is expected to be useful for adapting a language model, but slide text is not always available in machine-readable form. In this paper, we propose a language model adaptation framework that uses character recognition results from slide images in a lecture video. Since character recognition results contain many errors, we introduce a filtering method based on morphological and topic information. We then linearly interpolate the baseline language model with the filtered results and with relevant texts that are selected automatically from a text database using the filtered results. We further apply a cache-based adaptation method to the resulting language model, in which keywords in the filtered results are cached and used to boost their word probabilities. In an experimental evaluation on real lectures, this adaptation framework yielded a significant improvement in ASR performance.
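The interpolation and cache steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes unigram models for brevity, the interpolation weights and the `cache_weight` value are hypothetical, and modelling the keyword cache as a uniform distribution mixed into the adapted model is an assumption, since the abstract does not specify the exact form of the boost.

```python
from collections import Counter


def unigram(tokens):
    """Maximum-likelihood unigram distribution over a list of tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def interpolate(models, weights):
    """Linear interpolation: P(w) = sum_i weights[i] * P_i(w)."""
    assert abs(sum(weights) - 1.0) < 1e-9, "weights must sum to 1"
    vocab = set().union(*models)
    return {
        w: sum(lam * m.get(w, 0.0) for lam, m in zip(weights, models))
        for w in vocab
    }


def cache_boost(model, keywords, cache_weight):
    """Boost keywords by mixing in a uniform cache distribution (assumed form)."""
    cache = {w: 1.0 / len(keywords) for w in keywords}
    return interpolate([model, cache], [1.0 - cache_weight, cache_weight])


# Toy data standing in for the three text sources named in the abstract.
baseline = unigram("the model is trained on the data".split())
slides = unigram("language model adaptation".split())        # filtered OCR text
relevant = unigram("adaptation of the language model".split())  # retrieved texts

# Hypothetical weights; in practice they would be tuned on held-out data.
adapted = interpolate([baseline, slides, relevant], [0.6, 0.2, 0.2])
final = cache_boost(adapted, {"adaptation"}, cache_weight=0.1)
```

In this toy setup, a word such as "adaptation" that is absent from the baseline gains probability mass from the slide and retrieved texts, and the cache step raises it further while the distribution still sums to one.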
Pages: 5431-5435
Number of pages: 5