Auxiliary Features from Laser-Doppler Vibrometer Sensor for Deep Neural Network Based Robust Speech Recognition

Cited by: 7
Authors
Sun, Lei [1]
Du, Jun [2]
Xie, Zhipeng [3]
Xu, Yong [4]
Affiliations
[1] Univ Sci & Technol China, 96 JinZhai Rd, Hefei, Anhui, Peoples R China
[2] Univ Sci & Technol China, iFlytek Speech Lab, 96 JinZhai Rd, Hefei, Anhui, Peoples R China
[3] iFlytek Co Ltd, iFlytek Res, Hefei, Anhui, Peoples R China
[4] Univ Surrey, Guildford GU2 7XH, Surrey, England
Source
JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY | 2018, Vol. 90, No. 07
Funding
National Natural Science Foundation of China;
Keywords
Laser Doppler vibrometer; Auxiliary features; Deep neural network; Regression model; Speech recognition; NOISE;
DOI
10.1007/s11265-017-1287-x
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
Signals captured by a laser Doppler vibrometer (LDV) sensor have recently been shown to improve the noise robustness of automatic speech recognition (ASR) systems when used to enhance the acoustic signal prior to feature extraction. This study proposes an alternative approach: concatenating auxiliary features extracted from the LDV signal with the conventional acoustic features, to further improve the performance of ASR systems that use a deep neural network (DNN) for acoustic modeling. Preliminary experiments on a small stereo-data set containing both LDV and acoustic signals demonstrate the effectiveness of this approach. To leverage existing large-scale speech databases, a regression DNN is then designed to map acoustic features to LDV features. It is trained on the limited-size stereo-data set and then used to generate pseudo-LDV features from a massive speech data set for parallel training of an ASR system. The experiments verify that both the features from the limited-scale LDV data set and the massive-scale pseudo-LDV features yield significant improvements in recognition performance over a system using purely acoustic features, in both quiet and noisy environments.
Pages: 975-983
Number of pages: 9
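
The pseudo-LDV idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the feature dimensions (8 acoustic, 4 LDV), the single tanh hidden layer, the synthetic data, and all function names are assumptions made for the example. A small regression network is fitted on a limited "stereo" set of paired acoustic and LDV features, then applied to a larger acoustic-only set, and its outputs are concatenated with the acoustic features.

```python
import numpy as np

def init_mlp(in_dim, hidden_dim, out_dim, rng):
    # One-hidden-layer regression network (a hypothetical stand-in for the regression DNN).
    return {
        "W1": rng.normal(0.0, 0.1, (in_dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "W2": rng.normal(0.0, 0.1, (hidden_dim, out_dim)),
        "b2": np.zeros(out_dim),
    }

def forward(params, x):
    # Returns hidden activations and the linear regression output.
    h = np.tanh(x @ params["W1"] + params["b1"])
    return h, h @ params["W2"] + params["b2"]

def train_regression(params, acoustic, ldv, lr=0.1, epochs=300):
    # Full-batch gradient descent on the mean squared error between
    # predicted and measured LDV features; updates params in place.
    losses = []
    for _ in range(epochs):
        h, pred = forward(params, acoustic)
        err = pred - ldv
        losses.append(float(np.mean(err ** 2)))
        grad_out = 2.0 * err / err.size                  # dLoss/dpred
        grad_h = (grad_out @ params["W2"].T) * (1.0 - h ** 2)
        params["W2"] -= lr * (h.T @ grad_out)
        params["b2"] -= lr * grad_out.sum(axis=0)
        params["W1"] -= lr * (acoustic.T @ grad_h)
        params["b1"] -= lr * grad_h.sum(axis=0)
    return losses

def augment_with_pseudo_ldv(params, acoustic):
    # Concatenate acoustic features with predicted (pseudo) LDV features per frame.
    _, pseudo_ldv = forward(params, acoustic)
    return np.concatenate([acoustic, pseudo_ldv], axis=1)

# Synthetic stand-ins for real features (assumed shapes, not from the paper).
rng = np.random.default_rng(0)
mixing = rng.normal(0.0, 0.5, (8, 4))
stereo_acoustic = rng.normal(size=(256, 8))        # small paired set
stereo_ldv = np.tanh(stereo_acoustic @ mixing)     # toy "LDV" targets
large_acoustic = rng.normal(size=(1000, 8))        # large acoustic-only set

params = init_mlp(8, 16, 4, rng)
losses = train_regression(params, stereo_acoustic, stereo_ldv)
augmented = augment_with_pseudo_ldv(params, large_acoustic)
print(augmented.shape)
```

In this sketch the augmented matrix would be the input to acoustic-model training; the acoustic columns pass through unchanged, so only the appended columns depend on the regression network.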