SEMI-SUPERVISED BOOTSTRAPPING APPROACH FOR NEURAL NETWORK FEATURE EXTRACTOR TRAINING

Cited by: 0
Authors
Grezl, Frantisek [1]
Karafiat, Martin [1]
Affiliations
[1] Brno Univ Technol, Speech FIT & IT4I Ctr Excellence, CS-61090 Brno, Czech Republic
Source
2013 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING (ASRU) | 2013
Keywords
Semi-supervised training; bootstrapping; bottle-neck features;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a bootstrapping approach for neural network (NN) training. The neural networks serve as bottle-neck feature extractors for a subsequent GMM-HMM recognizer. The recognizer is also used to transcribe the untranscribed data and to assign confidence scores to it. Based on the confidence, segments are selected and mixed with the supervised data, and new NNs are trained. With this approach, it is possible to recover 40-55% of the difference between partially and fully transcribed data (a 3 to 5% absolute improvement over an NN trained on supervised data only). Using the 70-85% of automatically transcribed segments with the highest confidence was found to be optimal for achieving this result.
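The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The `Segment` type, the function names, and the default `keep_fraction` of 0.8 (picked from inside the reported 70-85% range) are assumptions made for illustration, and how the recognizer computes a per-segment confidence is not specified in the abstract.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """Hypothetical container for one speech segment."""
    utt_id: str
    transcript: str
    confidence: float  # assumed per-segment score; higher means more reliable


def select_by_confidence(auto_segments: List[Segment],
                         keep_fraction: float) -> List[Segment]:
    """Keep the top `keep_fraction` of automatically transcribed segments."""
    if not 0.0 <= keep_fraction <= 1.0:
        raise ValueError("keep_fraction must lie in [0, 1]")
    # Rank from most to least confident, then cut at the requested fraction.
    ranked = sorted(auto_segments, key=lambda s: s.confidence, reverse=True)
    n_keep = int(keep_fraction * len(ranked))
    return ranked[:n_keep]


def build_training_set(supervised: List[Segment],
                       auto_segments: List[Segment],
                       keep_fraction: float = 0.8) -> List[Segment]:
    """Mix manually transcribed data with the selected automatic segments."""
    return list(supervised) + select_by_confidence(auto_segments, keep_fraction)
```

In a bootstrapping loop of this kind, the mixed set would be used to train new NNs, and the resulting recognizer could then re-transcribe the untranscribed data; the training and decoding steps themselves are outside this sketch.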
Pages: 470-475
Page count: 6