MULTILAYER PERCEPTRON WITH SPARSE HIDDEN OUTPUTS FOR PHONEME RECOGNITION

Cited by: 0
Authors
Sivaram, G. S. V. S. [1]
Hermansky, Hynek [1]
Affiliations
[1] Johns Hopkins Univ, Ctr Excellence, Ctr Language & Speech Proc, Dept Elect & Comp Engn, Baltimore, MD 21218 USA
Source
2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2011
Keywords
Multilayer perceptron; sparse features; machine learning; phoneme recognition;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
This paper introduces the sparse multilayer perceptron (SMLP), which learns the transformation from the inputs to the targets as in a multilayer perceptron (MLP), while the outputs of one of the internal hidden layers are forced to be sparse. This is achieved by adding a sparse regularization term to the cross-entropy cost and learning the parameters of the network to minimize the joint cost. On the TIMIT phoneme recognition task, the SMLP-based system trained using perceptual linear prediction (PLP) features performs better than the conventional MLP-based system. Furthermore, their combination yields a phoneme error rate of 21.2%, a relative improvement of 6.2% over the baseline.
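The joint cost described in the abstract can be illustrated with a minimal sketch. The abstract does not state the form of the sparse regularization term, so the L1 penalty on the hidden outputs used here is an assumption, as are the single hidden layer, the sigmoid nonlinearity, the weight `lam`, and all function names. It is a toy illustration of "cross-entropy plus a sparsity term", not the authors' implementation.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def softmax(z):
    # Subtract the maximum for numerical stability.
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def affine(W, b, x):
    # Computes W x + b, with W given as a list of rows.
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def joint_cost(x, target, W1, b1, W2, b2, lam):
    """Cross-entropy plus a weighted sparsity penalty on the hidden outputs.

    ASSUMPTION: the penalty is the L1 norm of the hidden outputs; the
    abstract only says "a sparse regularization term".
    """
    hidden = [sigmoid(v) for v in affine(W1, b1, x)]
    probs = softmax(affine(W2, b2, hidden))
    cross_entropy = -math.log(probs[target])
    sparsity = sum(abs(h) for h in hidden)
    return cross_entropy + lam * sparsity, cross_entropy, sparsity

# Hypothetical toy network: 2 inputs, 3 hidden units, 2 output classes.
W1 = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.8]]
b1 = [0.0, 0.0, 0.0]
W2 = [[0.2, -0.1, 0.4], [-0.3, 0.6, 0.1]]
b2 = [0.0, 0.0]

total, ce, sp = joint_cost([1.0, 2.0], 0, W1, b1, W2, b2, lam=0.1)
print(total, ce, sp)
```

Training would minimize `total` with respect to the weights; increasing `lam` trades classification accuracy on the training targets for sparser hidden outputs.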
Pages: 5336-5339
Number of pages: 4