A systematic study of DNN based speech enhancement in reverberant and reverberant-noisy environments

Cited by: 0

Authors:

Wang, Heming [1]

Pandey, Ashutosh [1]

Wang, Deliang [2]

Affiliations:

[1] Ohio State Univ, 281 Lane Ave, Columbus, OH 43210 USA

[2] Ctr Cognit & Brain Sci, 1835 Neil Ave, Columbus, OH 43210 USA

Source:

COMPUTER SPEECH AND LANGUAGE | 2025 / Vol. 89

Keywords:

Speech enhancement; Speech dereverberation; Self-attention; ARN; DC-CRN; NEURAL-NETWORK; DEREVERBERATION; IDENTIFICATION; RECOGNITION;

DOI:

10.1016/j.csl.2024.101677

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Deep learning has led to dramatic performance improvements for the task of speech enhancement, where deep neural networks (DNNs) are trained to recover clean speech from noisy and reverberant mixtures. Most of the existing DNN-based algorithms operate in the frequency domain, as time-domain approaches are believed to be less effective for speech dereverberation. In this study, we employ two DNNs: ARN (attentive recurrent network) and DC-CRN (densely-connected convolutional recurrent network), and systematically investigate the effects of different components on enhancement performance, such as window sizes, loss functions, and feature representations. We conduct evaluation experiments in two main conditions: reverberant-only and reverberant-noisy. Our findings suggest that incorporating larger window sizes is helpful for dereverberation, and adding transform operations (either convolutional or linear) to encode and decode waveform features improves the sparsity of the learned representations, and boosts the performance of time-domain models. Experimental results demonstrate that ARN and DC-CRN with proposed techniques achieve superior performance compared with other strong enhancement baselines.
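
The abstract's idea of framing a waveform with a chosen window size and passing the frames through a linear encode/decode transform can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names `frame` and `overlap_add`, the window and hop sizes, and the fixed random orthogonal matrix `enc`, which stands in for a transform that the paper's models would learn during training.

```python
import numpy as np

def frame(x, win, hop):
    """Split a 1-D signal into overlapping frames of length `win`."""
    n_frames = 1 + (len(x) - win) // hop
    idx = np.arange(win)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def overlap_add(frames, hop, length):
    """Rebuild a signal by averaging the overlapping frames."""
    n_frames, win = frames.shape
    out = np.zeros(length)
    count = np.zeros(length)
    for i in range(n_frames):
        out[i * hop:i * hop + win] += frames[i]
        count[i * hop:i * hop + win] += 1.0
    return out / np.maximum(count, 1.0)

rng = np.random.default_rng(0)
win, hop = 512, 256              # hypothetical window and hop sizes
x = rng.standard_normal(4096)    # stand-in for a waveform

# Hypothetical linear encoder: a random orthogonal matrix, so that its
# transpose acts as an exact decoder. A trained encoder would differ.
enc, _ = np.linalg.qr(rng.standard_normal((win, win)))

feats = frame(x, win, hop) @ enc                   # encoded frame features
x_hat = overlap_add(feats @ enc.T, hop, len(x))    # decode and resynthesize
```

In this sketch an enhancement network would operate on `feats` between the encode and decode steps; changing `win` changes how much temporal context each encoded frame carries.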


Pages: 12
