SPEECH RECOGNITION OF FOREIGN OUT-OF-VOCABULARY WORDS USING A HIERARCHICAL LANGUAGE MODEL

Citations: 0
Authors
Yamamoto, Hirofumi [1]
Kikui, Genichiro [2]
Nakamura, Satoshi [1, 2]
Sagisaka, Yoshinori [1, 3]
Affiliations
[1] Natl Inst Informat & Commun Technol, 2-2-2 Hikaridai, Seika, Kyoto, Japan
[2] ATR Spoken Language Commun Res Labs, Kyoto, Japan
[3] Waseda Univ, GITI, Tokyo, Japan
Source
INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5 | 2006
Keywords
Speech Recognition; Language model; Foreign word; Out-of-Vocabulary word; Hierarchical language model
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes a new speech recognition scheme for foreign out-of-vocabulary words embedded in native-language speech. To recognize foreign names frequently observed in news speech or in translation speech, we adopted a hierarchical language model that had been successfully applied to OOV words covering native vocabularies. In this hierarchical language model, OOV vocabularies are modeled as a word-class model in the upper-layered model, and their statistical phonotactic constraints are modeled in the lower-layered model. Since extra statistics are needed to cover foreign words and their pronunciation differences, we have introduced two techniques. The first is to combine translation target language models and translation source statistics of OOVs using the hierarchical language model. The second is to automatically generate recognition target pronunciations from original pronunciations by syllable-to-syllable mapping. To confirm the validity of this recognition scheme, we have conducted speech recognition experiments using English speech including Japanese personal names as OOV words. The proposed method outperformed the existing algorithm using a lexicon consisting of all the words in the training set. Surprisingly, it achieved better OOV recognition results than the non-OOV condition where all the proper names in the test set are registered in the lexicon.
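The two-layer idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class token `<OOV_NAME>`, the add-one smoothed syllable bigram, the toy probabilities, and the entries of `SYLLABLE_MAP` are all assumptions made up for illustration. The upper layer scores an OOV word as a single class token, the lower layer scores its syllable sequence, and a table lookup stands in for the syllable-to-syllable pronunciation mapping.

```python
import math
from collections import Counter

OOV_CLASS = "<OOV_NAME>"  # hypothetical class token used by the upper layer


class SyllableBigram:
    """Lower layer: add-one smoothed syllable bigram trained on OOV-class words."""

    def __init__(self, words):
        self.counts = {}          # previous syllable -> Counter of next syllables
        self.vocab = {"</s>"}     # possible next symbols
        for syllables in words:
            seq = ["<s>"] + list(syllables) + ["</s>"]
            self.vocab.update(syllables)
            for prev, cur in zip(seq, seq[1:]):
                self.counts.setdefault(prev, Counter())[cur] += 1

    def log_prob(self, syllables):
        seq = ["<s>"] + list(syllables) + ["</s>"]
        total = 0.0
        for prev, cur in zip(seq, seq[1:]):
            row = self.counts.get(prev, Counter())
            total += math.log((row[cur] + 1) / (sum(row.values()) + len(self.vocab)))
        return total


def hierarchical_log_prob(prev, word, syllables, upper, lower, lexicon):
    """log P(word | prev): in-lexicon words use the upper layer only;
    OOV words are scored as class token (upper) times syllable sequence (lower)."""
    if word in lexicon:
        return upper[(prev, word)]
    return upper[(prev, OOV_CLASS)] + lower.log_prob(syllables)


# Toy syllable-to-syllable mapping from source-language syllables to
# target-language phone strings (entries are illustrative assumptions).
SYLLABLE_MAP = {"ta": ["t", "aa"], "na": ["n", "aa"], "ka": ["k", "aa"]}


def map_pronunciation(syllables):
    return [phone for s in syllables for phone in SYLLABLE_MAP[s]]


# Toy data: two Japanese names as training material for the lower layer.
lower = SyllableBigram([["ta", "na", "ka"], ["na", "ka", "mu", "ra"]])
upper = {("mr", OOV_CLASS): math.log(0.5), ("mr", "smith"): math.log(0.1)}
lexicon = {"mr", "smith"}

score = hierarchical_log_prob("mr", "tanaka", ["ta", "na", "ka"], upper, lower, lexicon)
```

In this sketch a name never seen as a whole word still receives a usable score, because the lower layer only needs its syllables to follow plausible name phonotactics.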
Pages: 1870 / +
Page count: 2
Related papers
50 in total
  • [1] Dynamic out-of-vocabulary word registration to language model for speech recognition
    Norihide Kitaoka
    Bohan Chen
    Yuya Obashi
    EURASIP Journal on Audio, Speech, and Music Processing, 2021
  • [2] Dynamic out-of-vocabulary word registration to language model for speech recognition
    Kitaoka, Norihide
    Chen, Bohan
    Obashi, Yuya
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2021, 2021 (01)
  • [3] RNN Language Model Estimation for Out-of-Vocabulary Words
    Illina, Irina
    Fohr, Dominique
    HUMAN LANGUAGE TECHNOLOGY. CHALLENGES FOR COMPUTER SCIENCE AND LINGUISTICS, LTC 2017, 2020, 12598 : 199 - 211
  • [4] Out-of-vocabulary word recognition using a hierarchical language model based on multiple Markov models
    Yamamoto, H
    Kokubo, H
    Kikui, G
    Ogawa, Y
    Sagisaka, Y
    ELECTRONICS AND COMMUNICATIONS IN JAPAN PART II-ELECTRONICS, 2005, 88 (12) : 55 - 64
  • [5] Class-Based N-Gram Language Model for New Words Using Out-of-Vocabulary to In-Vocabulary Similarity
    Naptali, Welly
    Tsuchiya, Masatoshi
    Nakagawa, Seiichi
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2012, E95D (09) : 2308 - 2317
  • [6] A phoneme-based approach for eliminating out-of-vocabulary problem of Turkish speech recognition using Hidden Markov Model
    Yavuz, Erdem
    Topuz, Vedat
    COMPUTER SYSTEMS SCIENCE AND ENGINEERING, 2018, 33 (06) : 429 - 445
  • [7] Improving Recognition of Out-of-vocabulary Words in E2E Code-switching ASR by Fusing Speech Generation Methods
    Ye, Lingxuan
    Cheng, Gaofeng
    Yang, Runyan
    Yang, Zehui
    Tian, Sanli
    Zhang, Pengyuan
    Yan, Yonghong
    INTERSPEECH 2022, 2022 : 3163 - 3167
  • [8] Training a language model using webdata for large vocabulary Japanese spontaneous speech recognition
    Masumura, Ryo
    Hahm, Seongjun
    Ito, Akinori
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011 : 1476 - 1479
  • [9] Single-class Support Vector Machine for an Out-of-Vocabulary Rejection of Isolated Words
    He, Dongzhi
    Hou, Yibin
    Huang, Zhangqin
    Ding, Zhihao
    2009 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND BIOMIMETICS (ROBIO 2009), VOLS 1-4, 2009 : 1376 - 1380
  • [10] Recurrent Out-of-Vocabulary Word Detection Using Distribution of Features
    Asami, Taichi
    Masumura, Ryo
    Aono, Yushi
    Shinoda, Koichi
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016 : 1320 - 1324