DRC-NET: DENSELY CONNECTED RECURRENT CONVOLUTIONAL NEURAL NETWORK FOR SPEECH DEREVERBERATION

Cited by: 10

Authors:

Liu, Jinjiang [1]

Zhang, Xueliang [1]

Affiliations:

[1] Inner Mongolia Univ, Coll Comp Sci, Hohhot, Peoples R China

Source:

2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2022

Funding:

National Natural Science Foundation of China;

Keywords:

speech dereverberation; microphone array processing; convolutional recurrent neural network; deep learning; DOMAIN;

DOI:

10.1109/ICASSP43922.2022.9747111

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

Our previous work on frequency bin-wise independent processing dramatically reduced the computational complexity of recurrent neural networks (RNNs). Building on that result, this paper deploys RNNs extensively along the time dimension by using a channel-wise long short-term memory (LSTM) neural network. With this approach, RNN processing along the frequency dimension and along the time dimension of the time-frequency domain is unified. This allows us to combine a convolutional neural network (CNN) and an RNN into a basic neural operator, which leads to the Densely Connected Recurrent Convolutional Neural Network (DRC-NET). DRC-NET exploits both the infinite response of the RNN and the finite response of the CNN, and its balanced response characteristics significantly improve system performance. Experimental results show that both the non-causal and the causal versions of DRC-NET outperform the state-of-the-art (SOTA) model on the speech dereverberation task.
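The complexity argument in the abstract can be illustrated with a rough parameter count. The sketch below is not taken from the paper: the sizes `channels` and `freq_bins` are hypothetical, the hidden size is assumed equal to the input size, and the bin-wise LSTM is assumed to share its weights across all frequency bins. It uses only the standard count for a single LSTM layer with four gates and one bias vector per gate.

```python
def lstm_param_count(input_size: int, hidden_size: int) -> int:
    """Weights and biases of one standard LSTM layer (four gates, one bias per gate)."""
    return 4 * (input_size * hidden_size + hidden_size * hidden_size + hidden_size)


# Hypothetical feature-map sizes, chosen only for illustration (not from the paper).
channels = 16     # C: feature channels per time-frequency unit
freq_bins = 161   # F: frequency bins per frame

# Conventional layout: channels and frequency bins flattened into one vector per frame.
full_band = lstm_param_count(channels * freq_bins, channels * freq_bins)

# Bin-wise layout (assumed weight sharing): one small LSTM over the channel vector,
# applied independently to every frequency bin.
bin_wise = lstm_param_count(channels, channels)

# The shared LSTM runs once per bin, so per-frame work scales with F times its size.
bin_wise_per_frame = freq_bins * bin_wise

print(full_band, bin_wise, bin_wise_per_frame)
```

Under these assumptions the flattened layout grows quadratically with C × F, while the bin-wise layout grows quadratically with C only and linearly with F, which is one way to read the reported reduction in complexity.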

Pages: 166 - 170

Page count: 5

