Semantically Similar Document Retrieval Framework for Language Model Speaker Adaptation

Times cited: 0
Authors
Stas, Jan [1]
Zlacky, Daniel [1]
Hladek, Daniel [1]
Affiliation
[1] Tech Univ Kosice, Fac Elect Engn & Informat, Dept Elect & Multimedia Commun, Pk Komenskeho 13, Kosice 04120, Slovakia
Source
PROCEEDINGS OF THE 26TH INTERNATIONAL CONFERENCE RADIOELEKTRONIKA (RADIOELEKTRONIKA 2016) | 2016
Keywords
automatic speech recognition; document retrieval; latent semantic indexing; latent Dirichlet allocation; language modeling; speaker adaptation;
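DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code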
0808 ; 0809 ;
Abstract
The paper presents a framework for retrieving semantically similar documents to adapt a Slovak language model to the speaking style of a specific speaker. This research extends our previous study on language model speaker adaptation for the transcription of Slovak parliament proceedings with available speaker-specific text data. We used a large corpus to retrieve a semantically similar subset of text documents for each speaker and adjusted the parameters of an existing well-trained language model to that speaker's speaking style. The same large corpus was used to build the original topic-specific model of the Slovak language deployed in our automatic subtitling system. In the proposed framework, latent semantic indexing was implemented to retrieve the subset of semantically similar documents. The output hypotheses from the first step of speech recognition were used to identify patterns between terms and concepts contained in an unstructured collection of text documents. Preliminary results show a slight improvement in speech recognition accuracy for individual speakers in fully automatic subtitling of parliament speech, broadcast news TV shows, and TEDx talks.
Pages: 403-407
Number of pages: 5