Unsupervised Cross-Adaptation Using Language Model and Deep Learning Based Acoustic Model Adaptations

Cited by: 0

Authors:

Takagi, Akira [1]

Konno, Kazuki [1]

Kato, Masaharu [1]

Kosaka, Tetsuo [1]

Affiliation:

[1] Yamagata University, Graduate School of Science and Engineering, Yonezawa, Yamagata, Japan

Source:

2014 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA) | 2014

Funding:

Japan Society for the Promotion of Science

Keywords:

None listed

DOI:

Not available

Chinese Library Classification number:

TP39 [Computer applications]

Discipline classification codes:

081203; 0835

Abstract:

It is well known that deep learning-based speech recognition improves performance significantly. In deep learning based systems, the deep neural network hidden Markov model (DNN-HMM) is used as an acoustic model (AM). Recently, speaker adaptation techniques based on DNN-HMM have also been investigated. The aim of this work is to improve the performance of unsupervised batch adaptation using DNN-HMM. The proposed adaptation method is based on the cross-adaptation approach, where complementary information derived from several systems is used. Gaussian mixture model HMM (GMM-HMM), DNN-HMM, and language model (LM) adaptation processes are conducted sequentially in the cross-adaptation procedure. The proposed adaptation method was evaluated on a Japanese lecture speech recognition task, reducing the error rate by 13.5% compared to the baseline DNN-HMM-based large vocabulary continuous speech recognition system.

References

Page count: 4

15 entries in total

[1] Dahl, George E.; Yu, Dong; Deng, Li; Acero, Alex. Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2012, 20 (01): 30-42

[2] Furui, S; Nakamura, M; Ichiba, T; Iwano, K. Analysis and recognition of spontaneous speech using Corpus of Spontaneous Japanese [J]. SPEECH COMMUNICATION, 2005, 47 (1-2): 208-219

[3] Gales, MJF; Woodland, PC. Mean and variance adaptation within the MLLR framework [J]. COMPUTER SPEECH AND LANGUAGE, 1996, 10 (04): 249-264

[4] Kosaka T., 2011, P APSIPA ASC 2011

[5] Kosaka T., 2010, P INTERSP 2010

[6] Liao H, 2013, INT CONF ACOUST SPEE, P7947, DOI 10.1109/ICASSP.2013.6639212

[7] Liu X, 2010, 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, P342

[8] Liu X., 2011, INTERSPEECH 2011, P2857

[9] MOORE G, 2000, P ICSLP, P512

[10] Ochiai T., 2014, P ICASP 2014
