Improved Acoustic Feature Combination for LVCSR by Neural Networks

Cited by: 0
Authors
Plahl, Christian [1]
Schlueter, Ralf [1]
Ney, Hermann [1]
Affiliation
[1] Rhein Westfal TH Aachen, Lehrstuhl Informat 6, Dept Comp Sci, Aachen, Germany
Source
12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5 | 2011
Keywords
feature extraction; multi-layer neural network; speech recognition
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper investigates the combination of different acoustic features. Several methods for combining such features, such as concatenation or LDA, are well known. Even though LDA improves the system, feature combination by LDA has been shown to be suboptimal. We introduce a new method based on neural networks (NN). The posterior estimates derived from the NN lead to a significant improvement: a 6% relative improvement in word error rate (WER). Results are also compared to system combination. While system combination has been reported to outperform all other combination techniques, in this work the proposed NN-based combination outperforms system combination. We achieve a 2% relative improvement in WER, for an overall improvement of 7% relative to the baseline system. In addition to giving better recognition performance in terms of WER, NN-based combination reduces both training and testing complexity. Overall, only a single set of acoustic models is needed, in addition to the training of the NN.
Pages: 1244-1247
Page count: 4
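Note on the "relative" WER figures in the abstract: a relative improvement is measured against the baseline WER itself, not in absolute percentage points. The sketch below illustrates that arithmetic only. The function name and the WER values are hypothetical, chosen for illustration; the record above does not report absolute WERs.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction, in percent, of new_wer over baseline_wer."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Hypothetical WERs in percent (NOT taken from the paper).
baseline = 20.0   # assumed baseline system WER
combined = 18.6   # assumed WER after NN-based feature combination

# 1.4 points absolute on a 20.0 baseline is a 7% relative reduction.
print(round(relative_wer_reduction(baseline, combined), 1))
```

With these assumed values, an absolute drop of 1.4 points corresponds to a 7% relative reduction, which is the sense in which the abstract's percentages should be read.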