Speech Enhancement Based on Enhanced Empirical Wavelet Transform and Teager Energy Operator

Cited by: 0
Authors
Kuwalek, Piotr [1]
Jesko, Waldemar [2, 3]
Affiliations
[1] Poznan Univ Tech, Inst Elect Engn & Elect, PL-60965 Poznan, Poland
[2] Poznan Univ Tech, Inst Comp Sci, PL-60965 Poznan, Poland
[3] Poznan Supercomp & Networking Ctr, PL-61139 Poznan, Poland
Keywords
adaptive thresholds; enhanced empirical wavelet transform; denoising; speech enhancement; Teager energy operator; LOW-RANK; NOISE; ROBUST; FRAMEWORK
DOI
10.3390/electronics12143167
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
This paper presents a new speech-enhancement approach based on an enhanced empirical wavelet transform, in which the thresholds applied to the individual component signals produced by the transform are adapted in both time and scale. The time adaptation uses the Teager energy operator applied to each component signal, and the scale adaptation uses a modified level-dependent threshold principle. Unlike most common speech-enhancement methods, the proposed approach requires neither an explicit estimate of the noise level nor a priori knowledge of the signal-to-noise ratio. Its effectiveness was assessed on over 1000 speech recordings from the public LibriSpeech database. Various types of noise (including white, violet, brown, blue, and pink) and various types of disturbance (including traffic sounds, a hair dryer, and a fan) were added to the selected test signals. Two measures were used for the evaluation: the perceptual evaluation of speech quality score, which assesses the quality of the enhanced speech, and the signal-to-noise ratio, which assesses how effectively the disturbance is attenuated. The results were compared with those of other selected speech-enhancement methods and denoising techniques from the literature. The experiments show that the proposed method performs better than conventional methods in many types of high-noise conditions, producing less residual noise and lower speech distortion.
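The Teager energy operator named in the abstract has a standard discrete form, psi[n] = x[n]^2 - x[n-1] * x[n+1]. The sketch below illustrates only that standard definition; it is not the authors' implementation, and the function name `teager_energy` and the endpoint handling are assumptions made for this example. The abstract does not say how the operator output is turned into time-adapted thresholds, so that step is left out.

```python
import math


def teager_energy(x):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Assumption for this sketch: the two endpoints copy their nearest
    interior value so the output has the same length as the input.
    """
    n = len(x)
    if n < 3:
        raise ValueError("need at least 3 samples")
    psi = [0.0] * n
    for i in range(1, n - 1):
        psi[i] = x[i] * x[i] - x[i - 1] * x[i + 1]
    psi[0] = psi[1]
    psi[-1] = psi[-2]
    return psi


# Sanity check on a pure tone A*cos(w*n): the operator output is the
# constant A^2 * sin(w)^2, which tracks both amplitude and frequency.
A, w = 2.0, 0.3
tone = [A * math.cos(w * k) for k in range(64)]
psi = teager_energy(tone)
expected = (A * math.sin(w)) ** 2
```

Because the output grows with both amplitude and frequency of a local oscillation, the operator is often used to tell speech-dominated segments of a sub-band signal apart from noise-dominated ones.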
Pages: 21