Multi-task Learning Deep Neural Networks For Speech Feature Denoising

Citations: 0
Authors
Huang, Bin [1]
Ke, Dengfeng [2]
Zheng, Hao [2]
Xu, Bo [2]
Xu, Yanyan [1]
Su, Kaile [3]
Affiliations
[1] Beijing Forestry Univ, Sch Informat Sci & Technol, Beijing, Peoples R China
[2] Chinese Acad Sci, Inst Automat, Beijing, Peoples R China
[3] Griffith Univ, Inst Integrated & Intelligent Syst, Brisbane, Qld, Australia
Keywords
multi-task learning; feature denoising; deep neural networks; enhancement
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
Traditional automatic speech recognition (ASR) systems usually suffer a sharp drop in performance when noise is present in the speech. To build a robust ASR system, we introduce a new model that uses multi-task learning deep neural networks (MTL-DNN) to solve the speech denoising task at the feature level. In this model, the networks are initialized by pre-training restricted Boltzmann machines (RBMs) and fine-tuned by jointly learning multiple interacting tasks through a shared representation. For multi-task learning, we choose a noisy-clean speech pair fitting task as the primary task and separately explore two constraints as secondary tasks: phone label and phone cluster. In the experiments, the MTL-DNN reconstructs denoised speech from noisy speech input, and the result is evaluated with both a DNN-hidden Markov model (HMM) based ASR system and a Gaussian mixture model (GMM)-HMM based ASR system. Results show that, with the denoised speech, the word error rate (WER) is reduced by 53.14% and 34.84%, respectively, compared with the baselines. The MTL-DNN model also outperforms the general single-task learning deep neural network (STL-DNN) model, with performance improvements of 4.93% and 3.88%, respectively.
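The joint objective described in the abstract can be illustrated with a minimal sketch: a shared hidden layer feeds a primary regression head (noisy-to-clean feature fitting) and a secondary classification head (phone label). This is not the authors' implementation. The layer sizes, the single hidden layer, the squared-error and cross-entropy losses, the `aux_weight` mixing factor, and all function names are assumptions made for illustration, and RBM pre-training is omitted.

```python
import math
import random

def linear(x, W, b):
    """Affine map; W is a list of rows, one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + bj for row, bj in zip(W, b)]

def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-z)) for z in v]

def softmax(v):
    m = max(v)  # subtract the max for numerical stability
    e = [math.exp(z - m) for z in v]
    s = sum(e)
    return [z / s for z in e]

def init(n_out, n_in, rng):
    """Small random weights and zero biases (random init, not RBM pre-training)."""
    W = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out

def mtl_loss(params, noisy, clean, phone_id, aux_weight=0.1):
    """Return (total, primary, secondary) loss for one frame.

    aux_weight is a hypothetical mixing factor; the abstract does not
    say how the tasks are weighted.
    """
    h = sigmoid(linear(noisy, *params["shared"]))        # shared representation
    denoised = linear(h, *params["denoise_head"])        # primary: feature regression
    phone_probs = softmax(linear(h, *params["phone_head"]))  # secondary: phone label
    mse = sum((d - c) ** 2 for d, c in zip(denoised, clean)) / len(clean)
    xent = -math.log(phone_probs[phone_id])
    return mse + aux_weight * xent, mse, xent

rng = random.Random(0)
FEAT_DIM, HIDDEN, N_PHONES = 8, 16, 5  # illustrative sizes only
params = {
    "shared": init(HIDDEN, FEAT_DIM, rng),
    "denoise_head": init(FEAT_DIM, HIDDEN, rng),
    "phone_head": init(N_PHONES, HIDDEN, rng),
}
clean = [rng.uniform(-1.0, 1.0) for _ in range(FEAT_DIM)]
noisy = [c + rng.gauss(0.0, 0.3) for c in clean]  # synthetic additive noise

total, mse, xent = mtl_loss(params, noisy, clean, phone_id=2)
print(f"total={total:.4f}  primary(MSE)={mse:.4f}  secondary(xent)={xent:.4f}")
```

Because both heads read the same hidden activations, gradients from the secondary loss would also shape the shared layer during fine-tuning. Setting `aux_weight=0.0` reduces the sketch to a single-task (STL-style) objective.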
Pages: 2464-2468
Number of pages: 5