A Reduced Complexity MFCC-based Deep Neural Network Approach for Speech Enhancement

Cited by: 0

Authors:

Razani, Ryan [1]

Chung, Hanwook [1]

Attabi, Yazid [1]

Champagne, Benoit [1]

Affiliations:

[1] McGill Univ, Dept Elect & Comp Engn, 3480 Univ St, Montreal, PQ, Canada

Source:

2017 IEEE INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND INFORMATION TECHNOLOGY (ISSPIT) | 2017

Keywords:

Speech enhancement; deep learning; neural networks; low-complexity; MFCC; NOISE;

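DOI:

Not available

Chinese Library Classification (CLC) codes:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract: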

This paper focuses on a regression-based deep neural network (DNN) approach for single-channel speech enhancement. While DNNs can improve speech quality compared to classical approaches, they suffer from high computational complexity in the training stage. The main contribution of this work is to reduce the DNN complexity by introducing a spectral feature mapping from noisy mel-frequency cepstral coefficients (MFCC) to the enhanced short-time Fourier transform (STFT) spectrum. This approach requires far fewer input features and consequently leads to reduced DNN complexity. Because the network outputs frequency-domain speech features directly, it also avoids the information loss incurred when reconstructing the time-domain speech signal from its MFCC. Compared to the STFT-based DNN approach, the complexity of the proposed approach in the training phase is reduced by a factor of 4.75. Moreover, experimental results in terms of perceptual evaluation of speech quality (PESQ) and source-to-distortion ratio (SDR) show that the proposed approach outperforms the benchmark algorithms across various noise types and SNR levels.
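To make the idea in the abstract concrete, the following is a minimal sketch of how MFCC features can be computed from STFT magnitude frames and why a smaller input vector lowers the weight count of a fully connected network. It is not the authors' implementation: the sampling rate, FFT size, number of mel filters, number of MFCCs, and layer sizes are all illustrative assumptions, and the helper names (`mel_filterbank`, `dct_matrix`, `mfcc_from_magnitude`, `dense_weight_count`) are hypothetical. The record does not state these values, so the sketch makes no attempt to reproduce the reported factor of 4.75.

```python
import numpy as np


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular mel filters, shape (n_mels, n_fft // 2 + 1)."""
    def hz_to_mel(f):
        return 2595.0 * np.log10(1.0 + f / 700.0)

    def mel_to_hz(m):
        return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

    n_bins = n_fft // 2 + 1
    nyquist = sample_rate / 2.0
    # n_mels + 2 edges, equally spaced on the mel scale
    hz_edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(nyquist), n_mels + 2))
    bin_freqs = np.linspace(0.0, nyquist, n_bins)
    fb = np.zeros((n_mels, n_bins))
    for i in range(n_mels):
        lo, mid, hi = hz_edges[i], hz_edges[i + 1], hz_edges[i + 2]
        rising = (bin_freqs - lo) / (mid - lo)
        falling = (hi - bin_freqs) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb


def dct_matrix(n_out, n_in):
    """Unnormalized DCT-II basis, shape (n_out, n_in)."""
    k = np.arange(n_out)[:, None]
    n = np.arange(n_in)[None, :]
    return np.cos(np.pi * k * (n + 0.5) / n_in)


def mfcc_from_magnitude(mag, fb, dct):
    """Map STFT magnitudes (frames, bins) to MFCCs (frames, n_mfcc)."""
    mel_energy = (mag ** 2) @ fb.T           # power spectrum through mel filters
    return np.log(mel_energy + 1e-10) @ dct.T  # log compression, then DCT


def dense_weight_count(layer_sizes):
    """Number of weights (biases ignored) in a fully connected network."""
    return sum(a * b for a, b in zip(layer_sizes[:-1], layer_sizes[1:]))


# Illustrative settings only (assumptions, not taken from the paper).
SAMPLE_RATE, N_FFT, N_MELS, N_MFCC, HIDDEN = 16000, 512, 26, 13, 1024
N_BINS = N_FFT // 2 + 1

fb = mel_filterbank(N_MELS, N_FFT, SAMPLE_RATE)
dct = dct_matrix(N_MFCC, N_MELS)

# Random stand-in for noisy STFT magnitudes: 100 frames of N_BINS bins.
rng = np.random.default_rng(0)
noisy_mag = np.abs(rng.standard_normal((100, N_BINS)))
feats = mfcc_from_magnitude(noisy_mag, fb, dct)

# Both networks predict an N_BINS-dimensional spectrum; only the input differs.
stft_net = [N_BINS, HIDDEN, HIDDEN, N_BINS]
mfcc_net = [N_MFCC, HIDDEN, HIDDEN, N_BINS]
print("input dims:", N_BINS, "vs", feats.shape[1])
print("weights:", dense_weight_count(stft_net), "vs", dense_weight_count(mfcc_net))
```

In this sketch the saving comes entirely from the first layer, because the output stays in the STFT domain. The actual reduction depends on choices the record does not give, such as the number of context frames stacked at the input and the hidden-layer widths.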


Pages: 331-336

Page count: 6
