DEEP CONVOLUTIONAL NEURAL NETWORK-BASED INVERSE FILTERING APPROACH FOR SPEECH DE-REVERBERATION

Cited by: 4

Authors:

Chung, Hanwook [1,2]

Tomar, Vikrant Singh [2]

Champagne, Benoit [1]

Affiliations:

[1] McGill University, Department of Electrical and Computer Engineering, Montreal, PQ, Canada

[2] Fluent Ai, Montreal, PQ, Canada

Source:

PROCEEDINGS OF THE 2020 IEEE 30TH INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING (MLSP) | 2020

Funding:

Natural Sciences and Engineering Research Council of Canada (NSERC)

Keywords:

single-channel speech de-reverberation; inverse filtering; convolutive transfer function; deep convolutional neural network; U-net; DEREVERBERATION; ALGORITHM; MASKING;

DOI:

10.1109/mlsp49062.2020.9231707

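Chinese Library Classification (CLC) codes:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]

Discipline classification codes:

0808; 0809

Abstract: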

In this paper, we introduce a spectral-domain inverse filtering approach for single-channel speech de-reverberation using a deep convolutional neural network (CNN). The main goal is to better handle realistic reverberant conditions where the room impulse response (RIR) filter is longer than the short-time Fourier transform (STFT) analysis window. To this end, we consider the convolutive transfer function (CTF) model for the reverberant speech signal. In the proposed framework, the CNN architecture is trained to directly estimate the inverse filter of the CTF model. Among various choices for the CNN structure, we consider the U-net, which consists of a fully convolutional auto-encoder network with skip connections. Experimental results show that the proposed method provides better de-reverberation performance than prevalent benchmark algorithms under various reverberation conditions.
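
As a rough illustration of the CTF idea mentioned in the abstract, the sketch below treats one frequency bin of the STFT as a sequence of frames and models reverberation as a short filter applied across frames, then undoes it with a truncated inverse filter. This is only a toy example under stated assumptions: the function name `ctf_apply`, the two-tap filter, and the hand-derived inverse are hypothetical and chosen for readability. In the paper the inverse filter is estimated by a CNN, which is not reproduced here.

```python
def ctf_apply(frames, taps):
    """Convolve one frequency bin's STFT frames with a per-bin filter along the frame axis."""
    out = []
    for n in range(len(frames)):
        acc = 0j
        for l, tap in enumerate(taps):
            if n - l >= 0:  # frames before the start are treated as zero
                acc += tap * frames[n - l]
        out.append(acc)
    return out


# Hypothetical clean STFT coefficients of a single frequency bin over 6 frames.
clean = [1 + 0j, 0.5j, -0.25 + 0j, 0j, 0j, 0j]

# Hypothetical two-tap CTF for this bin: direct path plus one delayed frame.
ctf = [1 + 0j, 0.5 + 0j]
reverberant = ctf_apply(clean, ctf)

# Truncated inverse of (1 + 0.5 z^-1), i.e. the series sum_l (-0.5)^l z^-l.
# Derived by hand here; the paper instead trains a CNN to estimate the inverse filter.
inverse = [(-0.5) ** l + 0j for l in range(len(clean))]
estimate = ctf_apply(reverberant, inverse)

# Largest deviation between the recovered and the clean coefficients.
max_error = max(abs(e - c) for e, c in zip(estimate, clean))
```

In this toy case the inverse has as many taps as there are frames, so the clean coefficients are recovered up to floating-point rounding. With a realistic RIR the CTF spans many frames and varies per frequency bin, which is the setting the abstract targets.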

Pages: 6
