Deep Neural Network for Robust Speech Recognition With Auxiliary Features From Laser-Doppler Vibrometer Sensor

Authors
Xie, Zhipeng [1]
Du, Jun [1]
McLoughlin, Ian [2]
Xu, Yong [3]
Ma, Feng [3]
Wang, Haikun [3]
Affiliations
[1] Univ Sci & Technol China, NELSLIP, Hefei, Anhui, Peoples R China
[2] Univ Kent, Sch Comp, Medway, England
[3] IFlytek Res, Hefei, Anhui, Peoples R China
Source
2016 10TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2016
Keywords
laser Doppler vibrometer; auxiliary features; deep neural network; regression model; speech recognition
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Recently, the signal captured from a laser Doppler vibrometer (LDV) sensor has been used to improve the noise robustness of automatic speech recognition (ASR) systems by enhancing the acoustic signal prior to feature extraction. This study proposes another approach, in which auxiliary features extracted from the LDV signal are used alongside conventional acoustic features to further improve ASR performance, with a deep neural network (DNN) as the acoustic model. While this approach is promising, the best training data sets for ASR do not include LDV data in parallel with the acoustic signal. Thus, to leverage such existing large-scale speech databases, a regression DNN is designed to map acoustic features to LDV features. This regression DNN is trained on a limited-size parallel data set, then used to generate pseudo-LDV features from a massive speech data set for parallel training of an ASR system. Our experiments show that both the features from the limited-scale LDV data set and the massive-scale pseudo-LDV features can train an ASR system that significantly outperforms one using acoustic features alone, in both quiet and noisy environments.
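The pseudo-LDV pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation: the one-hidden-layer network, the tanh activation, the plain gradient-descent training, and all function names and hyperparameters (`train_regressor`, `predict_ldv`, `augment`, `hidden`, `epochs`, `lr`) are assumptions made for the example. The record does not specify the regression DNN's architecture or the feature dimensions.

```python
import numpy as np

def train_regressor(acoustic, ldv, hidden=32, epochs=300, lr=0.05, seed=0):
    """Fit a small regression network mapping acoustic frames to LDV frames.

    Hypothetical stand-in for the regression DNN: one tanh hidden layer,
    trained by full-batch gradient descent on mean squared error.
    """
    rng = np.random.default_rng(seed)
    n, d_in = acoustic.shape
    d_out = ldv.shape[1]
    w1 = rng.normal(0.0, 0.1, (d_in, hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 0.1, (hidden, d_out))
    b2 = np.zeros(d_out)
    for _ in range(epochs):
        h = np.tanh(acoustic @ w1 + b1)       # hidden activations
        pred = h @ w2 + b2                    # predicted LDV features
        err = (pred - ldv) / n                # gradient of 0.5 * mean squared error
        dh = (err @ w2.T) * (1.0 - h ** 2)    # backpropagate through tanh
        w2 -= lr * (h.T @ err)
        b2 -= lr * err.sum(axis=0)
        w1 -= lr * (acoustic.T @ dh)
        b1 -= lr * dh.sum(axis=0)
    return w1, b1, w2, b2

def predict_ldv(params, acoustic):
    """Generate pseudo-LDV features for acoustic-only data."""
    w1, b1, w2, b2 = params
    return np.tanh(acoustic @ w1 + b1) @ w2 + b2

def augment(acoustic, ldv_like):
    """Append (pseudo-)LDV features to each acoustic frame."""
    return np.concatenate([acoustic, ldv_like], axis=1)
```

In this sketch, `train_regressor` would be fitted on the small parallel set, `predict_ldv` applied to the large acoustic-only corpus, and the output of `augment` passed as input to the acoustic model.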
Pages: 5
Related papers
50 in total
  • [1] Auxiliary Features from Laser-Doppler Vibrometer Sensor for Deep Neural Network Based Robust Speech Recognition
    Sun, Lei
    Du, Jun
    Xie, Zhipeng
    Xu, Yong
    JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2018, 90 (07): : 975 - 983
  • [2] Auxiliary Features from Laser-Doppler Vibrometer Sensor for Deep Neural Network Based Robust Speech Recognition
    Lei Sun
    Jun Du
    Zhipeng Xie
    Yong Xu
    Journal of Signal Processing Systems, 2018, 90 : 975 - 983
  • [3] Speech Enhancement Based on Two-Stage Processing with Deep Neural Network for Laser Doppler Vibrometer
    Cai, Chengkai
    Iwai, Kenta
    Nishiura, Takanobu
    APPLIED SCIENCES-BASEL, 2023, 13 (03):
  • [4] LOCAL TRAJECTORY BASED SPEECH ENHANCEMENT FOR ROBUST SPEECH RECOGNITION WITH DEEP NEURAL NETWORK
    You, Yongbin
    Qian, Yanmin
    Yu, Kai
    2015 IEEE CHINA SUMMIT & INTERNATIONAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING, 2015, : 5 - 9
  • [5] Primi Speech Recognition Based on Deep Neural Network
    Hu, Wenjun
    Fu, Meijun
    Pan, Wenlin
    2016 IEEE 8TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS (IS), 2016, : 667 - 671
  • [6] Indonesian speech recognition based on Deep Neural Network
    Yang, Ruolin
    Yang, Jian
    Lu, Yu
    2021 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP), 2021, : 36 - 41
  • [7] Multiresolution Convolutional Neural Network For Robust Speech Recognition
    Naderi, Navid
    Nasersharif, Babak
    2017 25TH IRANIAN CONFERENCE ON ELECTRICAL ENGINEERING (ICEE), 2017, : 1459 - 1464
  • [8] Research on Chinese Speech Emotion Recognition Based on Deep Neural Network and Acoustic Features
    Lee, Ming-Che
    Yeh, Sheng-Cheng
    Chang, Jia-Wei
    Chen, Zhen-Yi
    SENSORS, 2022, 22 (13)
  • [9] Binaural Deep Neural Network for Robust Speech Enhancement
    Jiang, Yi
    Liu, Runsheng
    2014 IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, COMMUNICATIONS AND COMPUTING (ICSPCC), 2014, : 692 - 695
  • [10] Deep Neural Network-Based Generalized Sidelobe Canceller for Robust Multi-channel Speech Recognition
    Li, Guanjun
    Liang, Shan
    Nie, Shuai
    Liu, Wenju
    Yang, Zhanlei
    Xiao, Longshuai
    INTERSPEECH 2020, 2020, : 51 - 55