A reverberation-time-aware DNN approach leveraging spatial information for microphone array dereverberation

Cited by: 0

Authors:

Bo Wu

Minglei Yang

Kehuang Li

Zhen Huang

Sabato Marco Siniscalchi

Tong Wang

Chin-Hui Lee

Affiliations:

[1] Xidian University, National Laboratory of Radar Signal Processing

[2] School of Electrical and Computer Engineering, Department of Telecommunications

[3] Georgia Institute of Technology

[4] University of Enna Kore

Source:

EURASIP Journal on Advances in Signal Processing, vol. 2017

Keywords:

Deep neural networks (DNNs); Simultaneous speech dereverberation and beamforming; Auto-correlation function; Temporal and spatial contexts; Reverberation-time-aware (RTA)

DOI:

Not available

Abstract:

A reverberation-time-aware, deep-neural-network (DNN)-based multi-channel speech dereverberation framework is proposed to handle a wide range of reverberation times (RT60s). There are three key steps in designing a robust system. First, to accomplish simultaneous speech dereverberation and beamforming, we propose a framework, namely DNNSpatial, that selectively concatenates log-power spectral (LPS) input features of reverberant speech from multiple microphones in an array and maps them to the expected output LPS features of anechoic reference speech using a single DNN. Next, the temporal auto-correlation function of the received signals at different RT60s is investigated to show that RT60-dependent temporal-spatial contexts in feature selection are needed in the DNNSpatial training stage to optimize system performance in diverse reverberant environments. Finally, the RT60 is estimated in order to select the proper temporal and spatial contexts before feeding the LPS features to the trained DNNs for speech dereverberation. The experimental evidence gathered in this study indicates that the proposed framework outperforms both the state-of-the-art signal-processing dereverberation algorithm, weighted prediction error (WPE), and conventional DNNSpatial systems that do not take the reverberation time into account, even under extremely weak and severe reverberant conditions. The proposed technique generalizes well to unseen room sizes, array geometries and loudspeaker positions, and is robust to reverberation time estimation errors.
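The feature-assembly step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the frame length, hop size, the `CONTEXT_TABLE` values and the function names `log_power_spectrum`, `select_context` and `build_input` are all hypothetical and chosen only for illustration. The sketch computes LPS features per microphone, picks a temporal context and a channel subset from an RT60 estimate, and concatenates the selected frames into one DNN input vector per frame.

```python
import numpy as np

# Hypothetical lookup: (upper RT60 bound in seconds, context frames per side, channel indices).
# The numbers are placeholders for illustration, not values reported in the paper.
CONTEXT_TABLE = [
    (0.3, 5, (0, 1, 2, 3)),
    (0.6, 3, (0, 2)),
    (float("inf"), 1, (0,)),
]


def log_power_spectrum(signal, frame_len=512, hop=256, eps=1e-10):
    """Return LPS features with shape (num_frames, frame_len // 2 + 1)."""
    window = np.hanning(frame_len)
    num_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(num_frames)]
    )
    spectrum = np.fft.rfft(frames, axis=1)
    return np.log(np.abs(spectrum) ** 2 + eps)


def select_context(rt60, table=CONTEXT_TABLE):
    """Pick the temporal context and channel subset for an RT60 estimate."""
    for max_rt60, context, channels in table:
        if rt60 <= max_rt60:
            return context, channels
    return table[-1][1], table[-1][2]


def build_input(lps_per_channel, context, channels):
    """Concatenate 2 * context + 1 frames from the chosen channels for each frame."""
    stacked = np.concatenate([lps_per_channel[c] for c in channels], axis=1)
    # Repeat the edge frames so the first and last frames also get full context.
    padded = np.pad(stacked, ((context, context), (0, 0)), mode="edge")
    width = 2 * context + 1
    return np.stack(
        [padded[t : t + width].reshape(-1) for t in range(stacked.shape[0])]
    )


# Usage with synthetic four-channel noise standing in for reverberant speech.
rng = np.random.default_rng(0)
lps = [log_power_spectrum(rng.standard_normal(4096)) for _ in range(4)]
context, channels = select_context(0.5)
dnn_input = build_input(lps, context, channels)
```

Each row of `dnn_input` would then be fed to a regression DNN trained to predict the anechoic LPS frame; the network itself and the RT60 estimator are outside the scope of this sketch.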
