Improved Acoustic Feature Combination for LVCSR by Neural Networks

Cited by: 0

Authors:

Plahl, Christian [1]

Schlueter, Ralf [1]

Ney, Hermann [1]

Affiliations:

[1] RWTH Aachen University, Lehrstuhl für Informatik 6, Department of Computer Science, Aachen, Germany

Source:

12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5 | 2011

Keywords:

feature extraction; multi-layer neural network; speech recognition

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

This paper investigates the combination of different acoustic features. Several methods for combining such features, such as concatenation or LDA, are well known. Even though LDA improves the system, feature combination by LDA has been shown to be suboptimal. We introduce a new method based on neural networks (NNs). The posterior estimates derived from the NN lead to a significant improvement, yielding a 6% relative improvement in word error rate (WER). Results are also compared to system combination. While system combination has been reported to outperform all other combination techniques, in this work the proposed NN-based combination outperforms system combination: we achieve a 2% relative improvement in WER over system combination, resulting in an improvement of 7% relative to the baseline system. In addition to giving better recognition performance in terms of WER, NN-based combination reduces both training and testing complexity. Overall, we use a single set of acoustic models, together with the training of the NN.
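The abstract reports its gains as relative WER figures. As a reminder of how such a figure is usually computed, here is a minimal sketch; the function name and the WER values are hypothetical, chosen only to illustrate the arithmetic, and are not taken from the paper.

```python
def relative_wer_improvement(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction of new_wer over baseline_wer, in percent."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Hypothetical absolute WERs in percent (NOT results from the paper).
baseline = 20.0
candidate = 18.6

print(f"{relative_wer_improvement(baseline, candidate):.1f}% relative improvement")
```

Under these assumed numbers, an absolute drop of 1.4 WER points corresponds to a 7% relative improvement, which is why relative and absolute figures should not be compared directly.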


Pages: 1244-1247

Number of pages: 4

Related items (50 in total):

[41] Deep Neural Networks Regularization Using a Combination of Sparsity Inducing Feature Selection Methods
Farokhmanesh, Fatemeh
Sadeghi, Mohammad Taghi
NEURAL PROCESSING LETTERS, 2021, 53 (01) : 701 - 720
[42] Deep Neural Networks Regularization Using a Combination of Sparsity Inducing Feature Selection Methods
Fatemeh Farokhmanesh
Mohammad Taghi Sadeghi
Neural Processing Letters, 2021, 53 : 701 - 720
[43] Feature selection with neural networks
Verikas, A
Bacauskiene, M
PATTERN RECOGNITION LETTERS, 2002, 23 (11) : 1323 - 1335
[44] Neural networks for feature selection
Pal, NR
PROGRESS IN CONNECTIONIST-BASED INFORMATION SYSTEMS, VOLS 1 AND 2, 1998, : 1121 - 1124
[45] Feature Selection With Neural Networks
Philippe Leray
Patrick Gallinari
Behaviormetrika, 1999, 26 (1) : 145 - 166
[46] LDA Based Feature Estimation Methods for LVCSR
Pylkkoenen, Janne
INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 389 - 392
[47] Investigation of acoustic modeling techniques for LVCSR systems
Liu, X
Gales, MJF
Sim, KC
Yu, K
2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 849 - 852
[48] Acoustic Feature Prediction from Semantic Features for Expressive Speech using Deep Neural Networks
Jauk, Igor
Bonafonte, Antonio
Pascual, Santiago
2016 24TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2016, : 2320 - 2324
[49] Optimized combination of neural networks
Benediktsson, JA
Sveinsson, JR
Ersoy, OK
ISCAS 96: 1996 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS - CIRCUITS AND SYSTEMS CONNECTING THE WORLD, VOL 3, 1996, : 535 - 538
[50] A COMPARATIVE STUDY ON SYSTEM COMBINATION SCHEMES FOR LVCSR
Ma, Chengyuan
Kuo, Hong-Kwang Jeff
Soltau, Hagen
Cui, Xiaodong
Chaudhari, Upendra
Mangu, Lidia
Lee, Chin-Hui
2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4394 - 4397
