Multi-softmax Deep Neural Network for Semi-supervised Training

Cited: 0
Authors
Su, Hang [1 ,2 ]
Xu, Haihua [3 ]
Affiliations
[1] Int Comp Sci Inst, Berkeley, CA 94704 USA
[2] Univ Calif Berkeley, Dept Elect Engn & Comp Sci, Berkeley, CA 94720 USA
[3] Nanyang Technol Univ, Singapore, Singapore
Source
16th Annual Conference of the International Speech Communication Association (INTERSPEECH 2015), Vols 1-5 | 2015
Keywords
Semi-supervised training; Low resources; Deep Neural Networks
DOI
Not available
CLC Classification Number
O42 [Acoustics]
Discipline Classification Codes
070206; 082403
Abstract
In this paper we propose a Shared Hidden Layer Multi-softmax Deep Neural Network (SHL-MDNN) approach to semi-supervised training (SST), aimed at boosting low-resource speech recognition where only limited training data is available. Supervised and unsupervised data share the same hidden layers but are fed into separate softmax layers, so that erroneous automatic speech recognition (ASR) transcriptions of the unsupervised data have less effect on the shared hidden layers. Experimental results on Babel data indicate that this approach consistently outperforms naive SST on a DNN, yielding a 1.3% word error rate (WER) reduction over a supervised DNN hybrid system. In addition, retraining the softmax layer with supervised data yields up to a further 0.8% WER reduction. Confidence-based data selection is also studied in this setup; experiments show that the method is not sensitive to ASR transcription errors.
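The record itself contains no code; the following is a minimal PyTorch sketch of the shared-hidden-layer, multi-softmax idea the abstract describes. The class name, layer count, dimensions, and the two head names (head_sup, head_unsup) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class SHLMDNN(nn.Module):
    """Sketch of a Shared Hidden Layer Multi-softmax DNN (assumed structure):
    one stack of hidden layers shared by all data, plus a separate
    output (softmax) layer each for supervised and unsupervised data."""

    def __init__(self, feat_dim, hidden_dim, num_states, num_layers=4):
        super().__init__()
        layers, in_dim = [], feat_dim
        for _ in range(num_layers):
            layers += [nn.Linear(in_dim, hidden_dim), nn.Sigmoid()]
            in_dim = hidden_dim
        self.shared = nn.Sequential(*layers)                 # shared hidden layers
        self.head_sup = nn.Linear(hidden_dim, num_states)    # softmax head for supervised data
        self.head_unsup = nn.Linear(hidden_dim, num_states)  # softmax head for unsupervised data

    def forward(self, x, supervised):
        h = self.shared(x)
        # Route each batch to its own head; nn.CrossEntropyLoss applies
        # the log-softmax internally, so raw logits are returned here.
        return self.head_sup(h) if supervised else self.head_unsup(h)

# Illustrative usage: all dimensions are placeholders, not Babel-system values.
model = SHLMDNN(feat_dim=440, hidden_dim=1024, num_states=2000)
features = torch.randn(8, 440)                  # a batch of acoustic feature vectors
logits_sup = model(features, supervised=True)   # supervised branch
logits_unsup = model(features, supervised=False)
```

Because both heads back-propagate into the same shared stack, label noise from automatically transcribed data reaches the shared layers only through its own softmax head, which is the insulation effect the abstract describes.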
Pages: 3239-3243
Number of pages: 5