Unsupervised topic adaptation for morph-based speech recognition

Cited by: 0
Authors
Mansikkaniemi, Andre [1]
Kurimo, Mikko [2]
Affiliations
[1] Aalto Univ, Sch Sci, Dept Informat & Comp Sci, Espoo, Finland
[2] Aalto Univ, Sch Elect Engn, Dept Signal Proc & Acoust, Espoo, Finland
Source
14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5 | 2013
Funding
Academy of Finland
Keywords
unsupervised language model adaptation; vocabulary adaptation; morph-based speech recognition; MODELS
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Topic adaptation in automatic speech recognition (ASR) refers to the adaptation of the language model and vocabulary for improved recognition of in-domain speech data. In this work we implement unsupervised topic adaptation for morph-based ASR to improve recognition of foreign entity names. Based on the first-pass ASR hypothesis, similar texts are selected from a collection of articles and used to adapt the background language model (LM). Latent semantic indexing is used to index the adaptation corpus and the ASR output. We evaluate three types of index terms and their usefulness in unsupervised LM adaptation: statistical morphs, words, and a combination of morphs and words. Furthermore, we implement vocabulary adaptation alongside unsupervised LM adaptation. Foreign word candidates are selected from the in-domain texts based on how likely they are to be topic-related foreign entity names. Adapted pronunciation rules are generated for the selected foreign words. Morpheme adaptation is also performed by restoring over-segmented foreign words to their base forms, to ensure more reliable pronunciation modeling.
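The text-selection step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes whitespace-separated tokens as index terms (standing in for morphs or words), raw term counts without any weighting, and a plain truncated SVD as the latent semantic indexing step. The function names (`term_doc_matrix`, `fit_lsi`, `project`, `rank_articles`) and the toy articles are hypothetical and chosen only for illustration.

```python
import numpy as np


def term_doc_matrix(docs, vocab):
    """Raw term counts: one row per index term, one column per document."""
    row = {term: i for i, term in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(docs)))
    for col, doc in enumerate(docs):
        for token in doc.split():
            if token in row:  # out-of-vocabulary tokens are ignored
                counts[row[token], col] += 1.0
    return counts


def fit_lsi(counts, k):
    """Truncated SVD of the term-document matrix, keeping at most k dimensions."""
    u, s, _ = np.linalg.svd(counts, full_matrices=False)
    keep = min(k, int(np.sum(s > 1e-10)))  # drop numerically zero dimensions
    return u[:, :keep], s[:keep]


def project(u_k, s_k, term_vector):
    """Fold a term-count vector into the latent space: S_k^-1 U_k^T x."""
    return (u_k.T @ term_vector) / s_k


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def rank_articles(articles, hypothesis, k=2):
    """Return (article index, similarity) pairs, most similar first."""
    vocab = sorted({tok for doc in articles for tok in doc.split()})
    counts = term_doc_matrix(articles, vocab)
    u_k, s_k = fit_lsi(counts, k)
    query = project(u_k, s_k, term_doc_matrix([hypothesis], vocab)[:, 0])
    scores = [
        cosine(project(u_k, s_k, counts[:, j]), query)
        for j in range(len(articles))
    ]
    return sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)


# Toy adaptation corpus (hypothetical) and a first-pass ASR hypothesis.
articles = [
    "obama visits berlin obama speech",
    "ice hockey finals helsinki",
    "berlin summit obama merkel",
]
for index, score in rank_articles(articles, "obama berlin"):
    print(index, round(score, 3))
```

In such a setup the top-ranked articles would then serve as in-domain text for adapting the background language model; how many articles to keep and how to combine the models are left open here, since the abstract does not specify them.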
Pages: 2692-2696
Number of pages: 5