Tree-structured model selection and simulated-data adaptation for environmental and speaker robust speech recognition

Cited by: 0

Authors:

Thatphithakkul, Nattanun [1]

Kruatrachue, Boontee [1]

Wutiwiwatchai, Chai [2]

Marukatat, Sanparith [2]

Boonpiam, Vataya [2]

Affiliations:

[1] King Mongkuts Inst Technol Ladkrabang, Dept Comp Engn, Bangkok 10520, Thailand

[2] Natl Elect & Comp Technol Ctr, Human Language Technol Lab, Pathum Thani 12120, Thailand

Source:

2007 INTERNATIONAL SYMPOSIUM ON COMMUNICATIONS AND INFORMATION TECHNOLOGIES, VOLS 1-3 | 2007

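Keywords:

DOI:

Not available

CLC classification:

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification codes:

0808; 0809;

Abstract: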

This paper proposes the use of tree-structured model selection and simulated data in maximum likelihood linear regression (MLLR) adaptation for environment- and speaker-robust speech recognition. The objective of this work is to solve two major problems in robust speech recognition systems, namely unknown speakers and unknown environmental noise. The proposed solution is composed of two components. The first is based on a tree-structured model for selecting the speaker-dependent model that best matches the input speech. The second uses simulated data to adapt the selected acoustic model to the unknown noise. The proposed technique can thus alleviate both problems simultaneously. Experimental results show that the proposed system achieves a higher recognition rate than both a system using only the input speech for adaptation and a system using a multi-conditioned acoustic model.
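
The two components described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: each acoustic model is reduced to a single diagonal Gaussian (a stand-in for a full HMM set), the tree is descended greedily by log-likelihood, and adaptation is a bias-only shift of the mean, which is the simplest special case of an MLLR mean transform. The names `Node`, `select_model`, and `adapt_bias`, the tree layout, and all numeric values are hypothetical. The abstract does not say how the simulated data are generated, so the adaptation frames below are placeholders.

```python
import math
from dataclasses import dataclass, field
from typing import List, Sequence

Frame = Sequence[float]


@dataclass
class Node:
    """One tree node: a toy diagonal-Gaussian model plus child nodes (hypothetical)."""
    name: str
    mean: List[float]
    var: List[float]
    children: List["Node"] = field(default_factory=list)


def log_likelihood(node: Node, frames: Sequence[Frame]) -> float:
    """Total log-likelihood of the frames under the node's diagonal Gaussian."""
    total = 0.0
    for x in frames:
        for xi, m, v in zip(x, node.mean, node.var):
            total += -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
    return total


def select_model(root: Node, frames: Sequence[Frame]) -> Node:
    """Greedy descent: at each level keep the child that best matches the input."""
    node = root
    while node.children:
        node = max(node.children, key=lambda c: log_likelihood(c, frames))
    return node


def adapt_bias(node: Node, adaptation_frames: Sequence[Frame]) -> Node:
    """Bias-only mean adaptation (assumed simplification of MLLR).

    For one Gaussian with fixed variance, the maximum-likelihood bias is the
    sample mean of the adaptation frames minus the current mean.
    """
    n = len(adaptation_frames)
    dim = len(node.mean)
    sample_mean = [sum(x[d] for x in adaptation_frames) / n for d in range(dim)]
    bias = [sample_mean[d] - node.mean[d] for d in range(dim)]
    new_mean = [node.mean[d] + bias[d] for d in range(dim)]
    return Node(node.name + "+adapted", new_mean, list(node.var))


# Hypothetical two-level tree: speaker groups at the top, speakers at the leaves.
tree = Node("root", [2.5], [1.0], [
    Node("group_1", [0.5], [1.0], [Node("spk_a", [0.0], [1.0]),
                                   Node("spk_b", [1.0], [1.0])]),
    Node("group_2", [4.5], [1.0], [Node("spk_c", [4.0], [1.0]),
                                   Node("spk_d", [5.0], [1.0])]),
])

input_frames = [[4.9], [5.2]]        # placeholder input speech features
simulated_frames = [[5.5], [6.5]]    # placeholder simulated noisy features

selected = select_model(tree, input_frames)
adapted = adapt_bias(selected, simulated_frames)
```

In this toy setup the descent only scores the children along one path rather than every leaf, which is the usual motivation for organizing the candidate models as a tree.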


Pages: 1570 / +

Page count: 2
