Speech Enhancement Based on Two-Stage Processing with Deep Neural Network for Laser Doppler Vibrometer

Cited by: 2
Authors
Cai, Chengkai [1]
Iwai, Kenta [2]
Nishiura, Takanobu [2]
Affiliations
[1] Ritsumeikan Univ, Grad Sch Informat Sci & Engn, Kyoto 6038577, Japan
[2] Ritsumeikan Univ, Coll Informat Sci & Engn, Kyoto 6038577, Japan
Source
APPLIED SCIENCES-BASEL | 2023 / Vol. 13 / Issue 03
Keywords
distant-talking speech measurement; speech enhancement; deep neural network; laser Doppler vibrometer
DOI
10.3390/app13031958
Chinese Library Classification (CLC) number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Distant-talking speech measurement systems have been attracting attention because they can be applied in situations such as security and disaster relief. One proposed system uses a laser Doppler vibrometer (LDV) to acquire sound by measuring the vibration that a sound source induces in an object. Unlike traditional microphones, an LDV can pick up the target sound from a distance even in a noisy environment. However, the acquired sound is greatly distorted by the object's shape and frequency response. Because the observed speech is degraded in this particular way, conventional methods cannot be applied effectively to LDVs. We propose two speech enhancement methods for LDVs, both based on two-stage processing with deep neural networks. The first method restores the amplitude spectrum of the observed speech and then uses the restored amplitude spectrum to estimate the phase difference between the observed and clean speech. The second method restores the low-frequency components of the observed speech and then estimates the high-frequency components from the restored low-frequency components. The evaluation results indicate that both methods improved the observed speech in terms of sound quality, degree of deterioration, and intelligibility.
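The data flow of the first method (amplitude restoration, then phase-difference estimation) might be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names `restore_amplitude`, `estimate_phase_difference`, and `two_stage_enhance` are hypothetical, both stages are identity-like placeholders standing in for trained deep neural networks, and the phase difference is assumed to be defined as clean phase minus observed phase.

```python
import numpy as np


def restore_amplitude(observed_amplitude):
    """Hypothetical stand-in for the stage-1 network.

    A trained model would map the distorted amplitude spectrum to a
    restored one; this placeholder returns the input unchanged.
    """
    return observed_amplitude


def estimate_phase_difference(restored_amplitude):
    """Hypothetical stand-in for the stage-2 network.

    A trained model would estimate the phase difference from the restored
    amplitude spectrum; this placeholder predicts zero everywhere.
    """
    return np.zeros_like(restored_amplitude)


def two_stage_enhance(observed_spectrum):
    """Apply the two stages to a complex spectrogram (frequency x time)."""
    observed_amplitude = np.abs(observed_spectrum)
    observed_phase = np.angle(observed_spectrum)

    # Stage 1: restore the amplitude spectrum.
    restored_amplitude = restore_amplitude(observed_amplitude)

    # Stage 2: estimate the phase difference from the restored amplitude.
    # Assumption: difference = clean phase - observed phase.
    phase_difference = estimate_phase_difference(restored_amplitude)

    # Recombine restored amplitude with the corrected phase.
    corrected_phase = observed_phase + phase_difference
    return restored_amplitude * np.exp(1j * corrected_phase)
```

With the placeholder stages the function returns its input unchanged, which makes it easy to check the plumbing before substituting real models for the two stand-ins.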
Pages: 15