Connectionist speech recognition of Broadcast News

Cited by: 18
Authors
Robinson, AJ
Cook, GD
Ellis, DPW
Fosler-Lussier, E
Renals, SJ
Williams, DAG
Affiliations
[1] SoftSound Ltd, Autonomy Syst Ltd, Cambridge CB4 0WS, England
[2] Phonet Syst UK Ltd, Cheltenham GL52 8RW, Glos, England
[3] Columbia Univ, Dept Elect Engn, New York, NY 10027 USA
[4] Bell Labs, Lucent Technol, Murray Hill, NJ 07974 USA
[5] Univ Sheffield, Dept Comp Sci, Sheffield S1 4DP, S Yorkshire, England
[6] Univ Cambridge, Dept Engn, Cambridge CB2 1PZ, England
[7] Int Comp Sci Inst, Berkeley, CA 94704 USA
Keywords
speech recognition; neural networks; acoustic features; pronunciation modelling; search techniques; stack decoder
DOI
10.1016/S0167-6393(01)00058-9
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper describes connectionist techniques for recognition of Broadcast News. The fundamental difference between connectionist systems and more conventional mixture-of-Gaussian systems is that connectionist models directly estimate posterior probabilities as opposed to likelihoods. Access to posterior probabilities has enabled us to develop a number of novel approaches to confidence estimation, pronunciation modelling and search. In addition, we have investigated a new feature extraction technique based on the modulation-filtered spectrogram (MSG), and methods for combining multiple information sources. We have incorporated all of these techniques into a system for the transcription of Broadcast News, and we present results on the 1998 DARPA Hub-4E Broadcast News evaluation data. © 2002 Elsevier Science B.V. All rights reserved.
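Illustrative note (not part of the published abstract): the distinction between posteriors and likelihoods can be sketched with Bayes' rule. Dividing a class posterior P(q|x) by the class prior P(q) gives a scaled likelihood p(x|q)/p(x), which is the quantity a likelihood-based decoder typically expects. The function name, the floor value, and the numbers below are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import numpy as np

def scaled_likelihoods(posteriors, priors, floor=1e-8):
    """Convert per-frame class posteriors P(q|x) to scaled likelihoods p(x|q)/p(x).

    posteriors: array of shape (frames, classes), each row summing to 1.
    priors:     array of shape (classes,), the class priors P(q).
    floor:      hypothetical lower bound on priors to avoid division by zero.
    """
    posteriors = np.asarray(posteriors, dtype=float)
    priors = np.asarray(priors, dtype=float)
    # Bayes' rule: p(x|q) / p(x) = P(q|x) / P(q)
    return posteriors / np.maximum(priors, floor)

# Made-up example: two frames, three classes.
posteriors = np.array([[0.7, 0.2, 0.1],
                       [0.1, 0.6, 0.3]])
priors = np.array([0.5, 0.25, 0.25])
scaled = scaled_likelihoods(posteriors, priors)
```

A value above 1 means the frame makes that class more probable than its prior alone would suggest; a value below 1 means the opposite.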
Pages: 27-45
Number of pages: 19