A reverberation-time-aware DNN approach leveraging spatial information for microphone array dereverberation

Cited by: 0
Authors
Bo Wu
Minglei Yang
Kehuang Li
Zhen Huang
Sabato Marco Siniscalchi
Tong Wang
Chin-Hui Lee
Affiliations
[1] Xidian University, National Laboratory of Radar Signal Processing
[2] School of Electrical and Computer Engineering, Department of Telecommunications
[3] Georgia Institute of Technology
[4] University of Enna Kore
Source
EURASIP Journal on Advances in Signal Processing, vol. 2017
Keywords
Deep neural networks (DNNs); Simultaneous speech dereverberation and beamforming; Auto-correlation function; Temporal and spatial contexts; Reverberation-time-aware (RTA)
DOI
Not available
Abstract
A reverberation-time-aware deep-neural-network (DNN)-based multi-channel speech dereverberation framework is proposed to handle a wide range of reverberation times (RT60s). There are three key steps in designing a robust system. First, to accomplish simultaneous speech dereverberation and beamforming, we propose a framework, namely DNNSpatial, that selectively concatenates log-power spectral (LPS) input features of reverberant speech from multiple microphones in an array and maps them to the expected output LPS features of anechoic reference speech using a single DNN. Next, the temporal auto-correlation function of received signals at different RT60s is investigated to show that RT60-dependent temporal-spatial contexts in feature selection are needed in the DNNSpatial training stage to optimize system performance in diverse reverberant environments. Finally, the RT60 is estimated to select the proper temporal and spatial contexts before feeding the LPS features to the trained DNNs for speech dereverberation. The experimental evidence gathered in this study indicates that the proposed framework outperforms the state-of-the-art signal processing dereverberation algorithm, weighted prediction error (WPE), as well as conventional DNNSpatial systems that do not take the reverberation time into account, even under extremely weak and severe reverberant conditions. The proposed technique generalizes well to unseen room sizes, array geometries, and loudspeaker positions, and is robust to reverberation time estimation error.
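The input-feature construction described in the abstract can be illustrated with a minimal NumPy sketch. It computes LPS features per microphone, picks a temporal context (frames on each side) and a spatial context (number of microphones) from an estimated RT60, and concatenates the selected features into one input vector per frame. This is only a sketch under stated assumptions, not the authors' implementation: the function names, the frame length and hop, the edge padding, and every value in `CONTEXT_TABLE` (including whether context grows or shrinks with RT60) are hypothetical placeholders, since the abstract gives none of them. The RT60 estimator and the DNN itself are out of scope here.

```python
import numpy as np


def log_power_spectrum(signal, frame_len=512, hop=256, eps=1e-10):
    """Return LPS features of shape (num_frames, frame_len // 2 + 1)."""
    window = np.hanning(frame_len)
    num_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop:i * hop + frame_len] * window for i in range(num_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return np.log(power + eps)  # eps avoids log(0) on silent bins


# Hypothetical lookup: (RT60 upper bound in seconds, frames of context on
# each side, number of microphones). Placeholder values, not from the paper.
CONTEXT_TABLE = [(0.3, 1, 2), (0.8, 2, 3), (float("inf"), 3, 4)]


def select_context(rt60_seconds, table=CONTEXT_TABLE):
    """Pick (frame_context, num_mics) for an estimated RT60."""
    for upper_bound, frame_context, num_mics in table:
        if rt60_seconds <= upper_bound:
            return frame_context, num_mics
    raise ValueError("table must end with an unbounded entry")


def stack_features(lps_per_mic, frame_context, num_mics):
    """Concatenate temporal and spatial context into one vector per frame.

    lps_per_mic has shape (mics, frames, bins); the result has shape
    (frames, num_mics * (2 * frame_context + 1) * bins).
    """
    chosen = lps_per_mic[:num_mics]
    # Repeat the first and last frame so edge frames get a full context.
    padded = np.pad(
        chosen, ((0, 0), (frame_context, frame_context), (0, 0)), mode="edge"
    )
    width = 2 * frame_context + 1
    num_frames = chosen.shape[1]
    return np.stack(
        [padded[:, t:t + width, :].reshape(-1) for t in range(num_frames)]
    )
```

In this sketch, the vectors returned by `stack_features` would be the DNN input, and the LPS of the anechoic reference speech would be the regression target; choosing the context per utterance from the RT60 estimate is what makes the feature selection reverberation-time-aware.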
Related papers
23 in total
  • [1] A reverberation-time-aware DNN approach leveraging spatial information for microphone array dereverberation
    Wu, Bo
    Yang, Minglei
    Li, Kehuang
    Huang, Zhen
    Siniscalchi, Sabato Marco
    Wang, Tong
    Lee, Chin-Hui
    EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2017
  • [2] A Reverberation-Time-Aware Approach to Speech Dereverberation Based on Deep Neural Networks
    Wu, Bo
    Li, Kehuang
    Yang, Minglei
    Lee, Chin-Hui
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (01) : 102 - 111
  • [3] Speech dereverberation and noise reduction with a combined microphone array approach
    Gonzalez-Rodriguez, J
    Sanchez-Bote, JL
    Ortega-Garcia, J
    2000 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS, VOLS I-VI, 2000, : 1037 - 1040
  • [4] Reverberation aware deep learning for environment tolerant microphone array DOA estimation
    Liu, Yuji
    Tong, Feng
    Zhong, Shuanglian
    Hong, Qingyang
    Li, Lin
    APPLIED ACOUSTICS, 2021, 184
  • [5] A Late Reverberation Power Spectral Density Aware Approach to Speech Dereverberation Based on Deep Neural Networks
    Qi, Yuanlei
    Yang, Feiran
    Yang, Jun
    2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019, : 1700 - 1703
  • [6] Microphone Array Speaker Localizers Using Spatial-Temporal Information
    Sharon Gannot
    Tsvi Gregory Dvorkind
    EURASIP Journal on Advances in Signal Processing, 2006
  • [7] Microphone array speaker localizers using spatial-temporal information
    Gannot, Sharon
    Dvorkind, Tsvi Gregory
    EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING, 2006, 2006 (1)
  • [8] Multiple Sound Source Localization, Separation, and Reconstruction by Microphone Array: A DNN-Based Approach
    Chen, Long
    Chen, Guitong
    Huang, Lei
    Choy, Yat-Sze
    Sun, Weize
    APPLIED SCIENCES-BASEL, 2022, 12 (07)
  • [9] Microphone array speaker localizers using spatial-temporal information
    Gannot, Sharon
    Dvorkind, Tsvi Gregory
    Eurasip Journal on Applied Signal Processing, 2006, 2006
  • [10] Time-Aligned Spatial Upsampling of Spherical Microphone Array Recordings
    Poerschmann, Christoph
    Luebeck, Tim
    Arend, Johannes M.
    JOURNAL OF THE AUDIO ENGINEERING SOCIETY, 2024, 72 (10) : 726 - 738