Model adaptation employing DNN-based estimation of noise corruption function for noise-robust speech recognition

Cited by: 1
Authors
Yoon, Ki-mu [1]
Kim, Wooil [1]
Affiliations
[1] Incheon Natl Univ, Dept Comp Sci & Engn, 119 Acad RO, Incheon 22012, South Korea
Source
JOURNAL OF THE ACOUSTICAL SOCIETY OF KOREA | 2019 / Vol. 38 / No. 01
Keywords
Noise corruption function; DNN (Deep Neural Network); Model adaptation; Speech recognition; Noisy environments; COMPENSATION;
DOI
10.7776/ASK.2019.38.1.047
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
This paper proposes an acoustic model adaptation method for effective speech recognition in noisy environments. In the proposed algorithm, the noise corruption function is estimated using a DNN (Deep Neural Network), and the estimated function is then applied to the estimation of the model parameters. Experimental results using the Aurora 2.0 framework and database demonstrate that the proposed model adaptation method is more effective than conventional methods in both known and unknown noisy environments. In particular, the experiments in unknown environments show a 15.87 % relative improvement in average WER (Word Error Rate).
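The "relative improvement" in WER quoted in the abstract is conventionally computed as the reduction in WER divided by the baseline WER. The sketch below illustrates only that arithmetic; the function name `relative_wer_improvement` and the two WER values are hypothetical and are not taken from the paper, which this record does not reproduce.

```python
def relative_wer_improvement(baseline_wer: float, proposed_wer: float) -> float:
    """Return the relative WER reduction, in percent, of a proposed system
    over a baseline system. Both inputs are WERs in percent."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - proposed_wer) / baseline_wer


# Hypothetical WERs for illustration only (not the paper's results).
baseline = 20.0   # assumed average WER of a conventional method, in percent
proposed = 17.0   # assumed average WER of an adapted model, in percent

improvement = relative_wer_improvement(baseline, proposed)
print(f"{improvement:.2f} % relative improvement")
```

A relative figure of this kind says nothing about the absolute WER levels, so it should be read together with the per-condition results in the paper itself.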
Pages: 47-50
Page count: 4