Tree-structured model selection and simulated-data adaptation for environmental and speaker robust speech recognition

Cited by: 0
Authors
Thatphithakkul, Nattanun [1]
Kruatrachue, Boontee [1]
Wutiwiwatchai, Chai [2]
Marukatat, Sanparith [2]
Boonpiam, Vataya [2]
Affiliations
[1] King Mongkuts Inst Technol Ladkrabang, Dept Comp Engn, Bangkok 10520, Thailand
[2] Natl Elect & Comp Technol Ctr, Human Language Technol Lab, Pathum Thani 12120, Thailand
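Keywords
DOI
Not available
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic and Communication Technology]
Subject classification codes
0808; 0809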
Abstract
This paper proposes the use of tree-structured model selection and simulated data in maximum likelihood linear regression (MLLR) adaptation for environment- and speaker-robust speech recognition. The objective of this work is to solve two major problems in robust speech recognition systems, namely unknown speakers and unknown environmental noise. The proposed solution is composed of two components. The first uses a tree-structured model to select the speaker-dependent model that best matches the input speech. The second uses simulated data to adapt the selected acoustic model to the unknown noise. The proposed technique can thus alleviate both problems simultaneously. Experimental results show that the proposed system achieves a higher recognition rate than both a system that uses only the input speech for adaptation and a system that uses a multi-conditioned acoustic model.
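The two components described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no algorithmic detail, so every name here (`ModelNode`, `select_model`, `adapt_bias`) is hypothetical. Each acoustic model is reduced to a single diagonal Gaussian, the tree is searched greedily from the root, and MLLR is simplified to a bias-only mean shift (transform matrix fixed to the identity). The adaptation frames stand in for the simulated data, whose construction the abstract does not specify.

```python
import math
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class ModelNode:
    """Hypothetical tree node: one diagonal Gaussian standing in for an acoustic model."""
    name: str
    mean: List[float]
    var: List[float]
    children: List["ModelNode"] = field(default_factory=list)


def log_likelihood(node: ModelNode, frames: Sequence[Sequence[float]]) -> float:
    """Sum of diagonal-Gaussian log-densities over all feature frames."""
    total = 0.0
    for frame in frames:
        for x, m, v in zip(frame, node.mean, node.var):
            total += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
    return total


def select_model(root: ModelNode, frames: Sequence[Sequence[float]]) -> ModelNode:
    """Greedy descent: at each level keep the child that best matches the input."""
    node = root
    while node.children:
        node = max(node.children, key=lambda child: log_likelihood(child, frames))
    return node


def adapt_bias(node: ModelNode, frames: Sequence[Sequence[float]]) -> ModelNode:
    """Bias-only simplification of MLLR: new_mean = mean + b.

    For a single Gaussian, the maximum likelihood bias b is the average
    of (frame - mean) over the adaptation frames.
    """
    n = len(frames)
    dims = len(node.mean)
    bias = [sum(f[d] - node.mean[d] for f in frames) / n for d in range(dims)]
    new_mean = [m + b for m, b in zip(node.mean, bias)]
    return ModelNode(node.name + "-adapted", new_mean, list(node.var))


# Toy one-dimensional tree: root -> {group_a, group_b}, group_a -> {spk_1, spk_2}.
spk_1 = ModelNode("spk_1", [-1.0], [1.0])
spk_2 = ModelNode("spk_2", [1.0], [1.0])
group_a = ModelNode("group_a", [0.0], [1.0], [spk_1, spk_2])
group_b = ModelNode("group_b", [5.0], [1.0])
root = ModelNode("root", [2.5], [4.0], [group_a, group_b])

input_frames = [[0.9], [1.2]]        # features of the input speech
adaptation_frames = [[2.0], [3.0]]   # placeholder for simulated noisy data

selected = select_model(root, input_frames)
adapted = adapt_bias(selected, adaptation_frames)
```

In this toy setup the search descends through `group_a` to `spk_2`, and the adaptation then shifts that model's mean toward the adaptation frames. A full MLLR implementation would instead estimate an affine transform shared across many Gaussians of an HMM-based acoustic model.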
Pages: 1570 / +
Page count: 2