Deep Neural Network for Robust Speech Recognition With Auxiliary Features From Laser-Doppler Vibrometer Sensor

Cited by: 0

Authors:

Xie, Zhipeng [1]

Du, Jun [1]

McLoughlin, Ian [2]

Xu, Yong [3]

Ma, Feng [3]

Wang, Haikun [3]

Affiliations:

[1] Univ Sci & Technol China, NELSLIP, Hefei, Anhui, Peoples R China

[2] Univ Kent, Sch Comp, Medway, England

[3] IFlytek Res, Hefei, Anhui, Peoples R China

Source:

2016 10TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP) | 2016

Keywords:

laser Doppler vibrometer; auxiliary features; deep neural network; regression model; speech recognition;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP301 [Theory, methods];

Discipline classification code:

081202;

Abstract:

Recently, the signal captured from a laser Doppler vibrometer (LDV) sensor has been used to improve the noise robustness of automatic speech recognition (ASR) systems by enhancing the acoustic signal prior to feature extraction. This study proposes another approach, in which auxiliary features extracted from the LDV signal are used alongside conventional acoustic features to further improve ASR performance, with a deep neural network (DNN) as the acoustic model. While this approach is promising, the best training data sets for ASR do not include LDV data in parallel with the acoustic signal. Thus, to leverage such existing large-scale speech databases, a regression DNN is designed to map acoustic features to LDV features. This regression DNN is trained on a limited-size parallel data set and then used to generate pseudo-LDV features from a massive speech data set, so that an ASR system can be trained on parallel acoustic and pseudo-LDV features. Our experiments show that both the features from the limited-scale LDV data set and the massive-scale pseudo-LDV features can train an ASR system that significantly outperforms one using acoustic features alone, in both quiet and noisy environments.
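
The pseudo-LDV idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature dimensions, the one-hidden-layer network, the learning rate, and the synthetic data are all assumptions chosen to keep the example small. It trains a regression network on a small parallel set of (acoustic, LDV) frames, predicts pseudo-LDV features for a larger acoustic-only set, and appends them to the acoustic features as auxiliary input for an acoustic model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen for illustration only (not from the paper).
ACOUSTIC_DIM, LDV_DIM, HIDDEN = 8, 4, 16


def init_params(rng):
    """Small one-hidden-layer regression network (an assumed architecture)."""
    return {
        "W1": rng.normal(0.0, 0.1, (ACOUSTIC_DIM, HIDDEN)),
        "b1": np.zeros(HIDDEN),
        "W2": rng.normal(0.0, 0.1, (HIDDEN, LDV_DIM)),
        "b2": np.zeros(LDV_DIM),
    }


def predict(params, x):
    """Map acoustic feature frames to (pseudo-)LDV feature frames."""
    h = np.tanh(x @ params["W1"] + params["b1"])
    return h @ params["W2"] + params["b2"]


def mse(params, x, y):
    return float(np.mean((predict(params, x) - y) ** 2))


def train(params, x, y, lr=0.05, epochs=500):
    """Full-batch gradient descent on a squared-error regression loss."""
    n = len(x)
    for _ in range(epochs):
        h = np.tanh(x @ params["W1"] + params["b1"])
        out = h @ params["W2"] + params["b2"]
        err = (out - y) / n  # gradient of 0.5 * sum of squared errors / n
        grad_h = (err @ params["W2"].T) * (1.0 - h ** 2)
        params["W2"] -= lr * (h.T @ err)
        params["b2"] -= lr * err.sum(axis=0)
        params["W1"] -= lr * (x.T @ grad_h)
        params["b1"] -= lr * grad_h.sum(axis=0)
    return params


# Synthetic stand-in for the limited parallel (acoustic, LDV) data set.
true_map = rng.normal(0.0, 0.5, (ACOUSTIC_DIM, LDV_DIM))
x_small = rng.normal(size=(200, ACOUSTIC_DIM))
y_small = np.tanh(x_small @ true_map) + 0.01 * rng.normal(size=(200, LDV_DIM))

# Synthetic stand-in for the large acoustic-only data set.
x_large = rng.normal(size=(1000, ACOUSTIC_DIM))

params = init_params(rng)
mse_before = mse(params, x_small, y_small)
params = train(params, x_small, y_small)
mse_after = mse(params, x_small, y_small)

# Generate pseudo-LDV features and append them as auxiliary features.
pseudo_ldv = predict(params, x_large)
augmented = np.concatenate([x_large, pseudo_ldv], axis=1)
```

Here `augmented` holds one row per frame, with the acoustic features followed by the predicted pseudo-LDV features; in a real system these rows would be fed to the DNN acoustic model in place of the acoustic features alone.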


Pages: 5
