Auxiliary Features from Laser-Doppler Vibrometer Sensor for Deep Neural Network Based Robust Speech Recognition

Cited by: 0

Authors:

Lei Sun

Jun Du

Zhipeng Xie

Yong Xu

Affiliations:

[1] University of Science and Technology of China

[2] iFlytek Research

[3] University of Surrey

Source:

Journal of Signal Processing Systems | 2018 / Vol. 90

Keywords:

Laser Doppler vibrometer; Auxiliary features; Deep neural network; Regression model; Speech recognition

DOI:

Not available


Abstract:

Recently, signals captured by a laser Doppler vibrometer (LDV) sensor have been shown to improve the noise robustness of automatic speech recognition (ASR) systems by enhancing the acoustic signal prior to feature extraction. In this study, an alternative approach is proposed to further improve ASR performance with deep neural network (DNN) acoustic models: concatenating auxiliary features extracted from the LDV signal with the conventional acoustic features. Preliminary experiments on a small stereo data set containing both LDV and acoustic signals demonstrate its effectiveness. To leverage existing large-scale speech databases, a regression DNN is then designed to map acoustic features to LDV features; it is trained on the limited-size stereo data set and used to generate pseudo-LDV features from a massive speech data set for parallel training of an ASR system. Our experiments verify that both the features from the limited-scale LDV data set and the massive-scale pseudo-LDV features yield significant improvements in recognition performance over a system using purely acoustic features, in both quiet and noisy environments.
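The two ideas in the abstract, frame-wise feature concatenation and a regression network that produces pseudo-LDV features, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, the network is a single-hidden-layer NumPy model trained by plain gradient descent, and the names `concat_features` and `RegressionMLP` are hypothetical. The record above does not state the paper's actual feature types, dimensions, or network architecture.

```python
import numpy as np


def concat_features(acoustic, ldv):
    """Frame-wise concatenation: (T, Da) and (T, Dl) -> (T, Da + Dl)."""
    if acoustic.shape[0] != ldv.shape[0]:
        raise ValueError("acoustic and LDV streams must have the same frame count")
    return np.concatenate([acoustic, ldv], axis=1)


class RegressionMLP:
    """Hypothetical one-hidden-layer regressor from acoustic to LDV features."""

    def __init__(self, in_dim, hidden_dim, out_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.1, (in_dim, hidden_dim))
        self.b1 = np.zeros(hidden_dim)
        self.W2 = rng.normal(0.0, 0.1, (hidden_dim, out_dim))
        self.b2 = np.zeros(out_dim)

    def predict(self, x):
        h = np.tanh(x @ self.W1 + self.b1)
        return h @ self.W2 + self.b2

    def fit(self, x, y, lr=0.1, epochs=300):
        """Full-batch gradient descent on mean squared error; returns the loss history."""
        losses = []
        for _ in range(epochs):
            h = np.tanh(x @ self.W1 + self.b1)
            err = (h @ self.W2 + self.b2) - y
            losses.append(float(np.mean(err ** 2)))
            g_pred = 2.0 * err / err.size          # d(MSE)/d(prediction)
            g_h = g_pred @ self.W2.T               # backpropagate into hidden layer
            g_z = g_h * (1.0 - h ** 2)             # tanh derivative
            self.W2 -= lr * (h.T @ g_pred)
            self.b2 -= lr * g_pred.sum(axis=0)
            self.W1 -= lr * (x.T @ g_z)
            self.b1 -= lr * g_z.sum(axis=0)
        return losses


# Synthetic stand-ins (assumed sizes): a small stereo set with both streams,
# and a larger acoustic-only set.
rng = np.random.default_rng(1)
true_map = rng.normal(size=(8, 4))
stereo_acoustic = rng.normal(size=(200, 8))
stereo_ldv = np.tanh(stereo_acoustic @ true_map)
large_acoustic = rng.normal(size=(1000, 8))

# Train the regressor on the small stereo set, then generate pseudo-LDV
# features for the large set and concatenate them with the acoustic features.
model = RegressionMLP(in_dim=8, hidden_dim=16, out_dim=4)
losses = model.fit(stereo_acoustic, stereo_ldv)
pseudo_ldv = model.predict(large_acoustic)
augmented = concat_features(large_acoustic, pseudo_ldv)
```

In this sketch, `augmented` plays the role of the input to an acoustic model: each frame carries the original acoustic features followed by the predicted LDV features.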

Citation

Pages: 975-983

Page count: 8
