Auxiliary Features from Laser-Doppler Vibrometer Sensor for Deep Neural Network Based Robust Speech Recognition

Cited by: 7
Authors
Sun, Lei [1]
Du, Jun [2]
Xie, Zhipeng [3]
Xu, Yong [4]
Affiliations
[1] Univ Sci & Technol China, 96 JinZhai Rd, Hefei, Anhui, Peoples R China
[2] Univ Sci & Technol China, iFlytek Speech Lab, 96 JinZhai Rd, Hefei, Anhui, Peoples R China
[3] iFlytek Co Ltd, iFlytek Res, Hefei, Anhui, Peoples R China
[4] Univ Surrey, Guildford GU2 7XH, Surrey, England
Source
JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY | 2018 / Vol. 90 / No. 07
Funding
National Natural Science Foundation of China;
Keywords
Laser Doppler vibrometer; Auxiliary features; Deep neural network; Regression model; Speech recognition; NOISE;
DOI
10.1007/s11265-017-1287-x
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Recently, signals captured by a laser Doppler vibrometer (LDV) sensor have been shown to improve the noise robustness of automatic speech recognition (ASR) systems by enhancing the acoustic signal prior to feature extraction. In this study, an alternative approach is proposed to further improve ASR performance with deep neural network (DNN) acoustic models: concatenating auxiliary features extracted from the LDV signal with the conventional acoustic features. Preliminary experiments on a small set of stereo data containing both LDV and acoustic signals demonstrate its effectiveness. To leverage existing large-scale speech databases, a regression DNN is then designed to map acoustic features to LDV features. It is trained on the limited-size stereo data set and then used to generate pseudo-LDV features for a massive speech data set, enabling parallel training of an ASR system. Our experiments verify that both the features from the limited-scale LDV data set and the massive-scale pseudo-LDV features yield significant improvements in recognition performance over a system using purely acoustic features, in both quiet and noisy environments.
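The two ideas in the abstract, frame-wise feature concatenation and a regression model that generates pseudo-LDV features from acoustic features, can be illustrated with a minimal sketch. This is not the authors' implementation: the record gives no architecture details, so the one-hidden-layer network, the feature dimensions (8 acoustic, 4 LDV), the synthetic data, and the names `concat_features` and `RegressionMLP` are all assumptions chosen for illustration.

```python
import numpy as np


def concat_features(acoustic, ldv):
    """Join acoustic and LDV features frame by frame (frames x dims)."""
    if acoustic.shape[0] != ldv.shape[0]:
        raise ValueError("acoustic and LDV streams must have the same frame count")
    return np.concatenate([acoustic, ldv], axis=1)


class RegressionMLP:
    """Hypothetical one-hidden-layer regressor standing in for the regression DNN."""

    def __init__(self, in_dim, hidden_dim, out_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.1, (in_dim, hidden_dim))
        self.b1 = np.zeros(hidden_dim)
        self.W2 = rng.normal(0.0, 0.1, (hidden_dim, out_dim))
        self.b2 = np.zeros(out_dim)

    def predict(self, x):
        hidden = np.tanh(x @ self.W1 + self.b1)
        return hidden @ self.W2 + self.b2

    def fit(self, x, y, lr=0.1, epochs=300):
        """Full-batch gradient descent on mean squared error; returns the loss per epoch."""
        losses = []
        scale = x.shape[0] * y.shape[1]
        for _ in range(epochs):
            hidden = np.tanh(x @ self.W1 + self.b1)
            err = hidden @ self.W2 + self.b2 - y
            losses.append(float(np.mean(err ** 2)))
            # Backpropagate the MSE gradient through both layers.
            g_out = 2.0 * err / scale
            g_hidden = (g_out @ self.W2.T) * (1.0 - hidden ** 2)
            g_W2, g_b2 = hidden.T @ g_out, g_out.sum(axis=0)
            g_W1, g_b1 = x.T @ g_hidden, g_hidden.sum(axis=0)
            self.W2 -= lr * g_W2
            self.b2 -= lr * g_b2
            self.W1 -= lr * g_W1
            self.b1 -= lr * g_b1
        return losses


# Synthetic stand-in for the small stereo set (paired acoustic and LDV frames).
rng = np.random.default_rng(0)
stereo_acoustic = rng.normal(size=(200, 8))
stereo_ldv = np.tanh(stereo_acoustic @ rng.normal(size=(8, 4)))

# Train the acoustic-to-LDV mapping on the small paired set.
model = RegressionMLP(in_dim=8, hidden_dim=16, out_dim=4)
losses = model.fit(stereo_acoustic, stereo_ldv)

# Generate pseudo-LDV features for a larger acoustic-only set, then concatenate.
large_acoustic = rng.normal(size=(1000, 8))
pseudo_ldv = model.predict(large_acoustic)
asr_input = concat_features(large_acoustic, pseudo_ldv)
print(asr_input.shape, losses[0], losses[-1])
```

In this sketch `asr_input` plays the role of the augmented feature matrix that would be fed to a DNN acoustic model; the acoustic model itself is outside the scope of the example.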
Pages: 975-983
Page count: 9
Related papers
50 in total
  • [1] Auxiliary Features from Laser-Doppler Vibrometer Sensor for Deep Neural Network Based Robust Speech Recognition
    Lei Sun
    Jun Du
    Zhipeng Xie
    Yong Xu
    Journal of Signal Processing Systems, 2018, 90 : 975 - 983
  • [2] Deep Neural Network for Robust Speech Recognition With Auxiliary Features From Laser-Doppler Vibrometer Sensor
    Xie, Zhipeng
    Du, Jun
    McLoughlin, Ian
    Xu, Yong
    Ma, Feng
    Wang, Haikun
    2016 10TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2016,
  • [3] Speech Enhancement Based on Two-Stage Processing with Deep Neural Network for Laser Doppler Vibrometer
    Cai, Chengkai
    Iwai, Kenta
    Nishiura, Takanobu
    APPLIED SCIENCES-BASEL, 2023, 13 (03):
  • [4] LOCAL TRAJECTORY BASED SPEECH ENHANCEMENT FOR ROBUST SPEECH RECOGNITION WITH DEEP NEURAL NETWORK
    You, Yongbin
    Qian, Yanmin
    Yu, Kai
    2015 IEEE CHINA SUMMIT & INTERNATIONAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING, 2015, : 5 - 9
  • [5] Indonesian speech recognition based on Deep Neural Network
    Yang, Ruolin
    Yang, Jian
    Lu, Yu
    2021 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP), 2021, : 36 - 41
  • [6] Primi Speech Recognition Based on Deep Neural Network
    Hu, Wenjun
    Fu, Meijun
    Pan, Wenlin
    2016 IEEE 8TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS (IS), 2016, : 667 - 671
  • [7] Research on Chinese Speech Emotion Recognition Based on Deep Neural Network and Acoustic Features
    Lee, Ming-Che
    Yeh, Sheng-Cheng
    Chang, Jia-Wei
    Chen, Zhen-Yi
    SENSORS, 2022, 22 (13)
  • [8] Donggan speech recognition based on deep neural network
    Xu, Haiyan
    Yang, Hongwu
    You, Yuren
    PROCEEDINGS OF 2019 IEEE 8TH JOINT INTERNATIONAL INFORMATION TECHNOLOGY AND ARTIFICIAL INTELLIGENCE CONFERENCE (ITAIC 2019), 2019, : 354 - 358
  • [9] Deep Neural Network-Based Generalized Sidelobe Canceller for Robust Multi-channel Speech Recognition
    Li, Guanjun
    Liang, Shan
    Nie, Shuai
    Liu, Wenju
    Yang, Zhanlei
    Xiao, Longshuai
    INTERSPEECH 2020, 2020, : 51 - 55
  • [10] Deep Q-network-based noise suppression for robust speech recognition
    Park, Tae-Jun
    Chang, Joon-Hyuk
    TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, 2021, 29 (05) : 2362 - 2373