Multi-softmax Deep Neural Network for Semi-supervised Training

Cited: 0
Authors
Su, Hang [1 ,2 ]
Xu, Haihua [3 ]
Affiliations
[1] Int Comp Sci Inst, Berkeley, CA 94704 USA
[2] Univ Calif Berkeley, Dept Elect Engn & Comp Sci, Berkeley, CA 94720 USA
[3] Nanyang Technol Univ, Singapore, Singapore
Source
16th Annual Conference of the International Speech Communication Association (INTERSPEECH 2015), Vols 1-5 | 2015
Keywords
Semi-supervised training; Low resources; Deep Neural Networks
DOI
Not available
CLC Classification Number
O42 [Acoustics]
Discipline Classification Codes
070206; 082403
Abstract
In this paper we propose a Shared Hidden Layer Multi-softmax Deep Neural Network (SHL-MDNN) approach to semi-supervised training (SST), aimed at boosting low-resource speech recognition where only limited training data is available. Supervised and unsupervised data share the same hidden layers but are fed into separate softmax layers, so that erroneous automatic speech recognition (ASR) transcriptions of the unsupervised data have less effect on the shared hidden layers. Experimental results on Babel data indicate that this approach consistently outperforms naive SST on a DNN, yielding a 1.3% word error rate (WER) reduction over a supervised DNN hybrid system. In addition, retraining the softmax layer with supervised data yields up to a further 0.8% WER reduction. Confidence-based data selection is also studied in this setup; experiments show that the method is not sensitive to ASR transcription errors.
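The record itself contains no code; the following is a minimal PyTorch sketch of the shared-hidden-layer, multi-softmax idea the abstract describes. The class name, layer count, dimensions, and the two head names (head_sup, head_unsup) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class SHLMDNN(nn.Module):
    """Sketch of a Shared Hidden Layer Multi-softmax DNN (assumed structure):
    one stack of hidden layers shared by all data, plus a separate
    output (softmax) layer each for supervised and unsupervised data."""

    def __init__(self, feat_dim, hidden_dim, num_states, num_layers=4):
        super().__init__()
        layers, in_dim = [], feat_dim
        for _ in range(num_layers):
            layers += [nn.Linear(in_dim, hidden_dim), nn.Sigmoid()]
            in_dim = hidden_dim
        self.shared = nn.Sequential(*layers)                 # shared hidden layers
        self.head_sup = nn.Linear(hidden_dim, num_states)    # softmax head for supervised data
        self.head_unsup = nn.Linear(hidden_dim, num_states)  # softmax head for unsupervised data

    def forward(self, x, supervised):
        h = self.shared(x)
        # Route each batch to its own head; nn.CrossEntropyLoss applies
        # the log-softmax internally, so raw logits are returned here.
        return self.head_sup(h) if supervised else self.head_unsup(h)

# Illustrative usage: all dimensions are placeholders, not Babel-system values.
model = SHLMDNN(feat_dim=440, hidden_dim=1024, num_states=2000)
features = torch.randn(8, 440)                  # a batch of acoustic feature vectors
logits_sup = model(features, supervised=True)   # supervised branch
logits_unsup = model(features, supervised=False)
```

Because both heads back-propagate into the same shared stack, label noise from automatically transcribed data reaches the shared layers only through its own softmax head, which is the insulation effect the abstract describes.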
Pages: 3239-3243
Number of pages: 5