Speech Enhancement based on Deep Convolutional Neural Network

Cited by: 2

Authors:

Nuthakki, Ramesh [1]

Masanta, Payel [1]

Yukta, T. N. [1]

Affiliations:

[1] Atria Inst Technol, Dept Elect & Commun Engn, Bangalore, Karnataka, India

Source:

PROCEEDINGS OF THE 2021 FIFTH INTERNATIONAL CONFERENCE ON I-SMAC (IOT IN SOCIAL, MOBILE, ANALYTICS AND CLOUD) (I-SMAC 2021) | 2021

Keywords:

Speech Enhancement; Deep Convolutional Neural Networks; Loss Functions; Speech Intelligibility; Harris Hawks Optimization; Coherence Speech Intelligibility Index; Estimated Short Time Objective Intelligibility; Mean Square Error; Perceptual Evaluation of Speech Quality; MEAN-SQUARE ERROR; INTELLIGIBILITY; ALGORITHM;

DOI:

10.1109/I-SMAC52330.2021.9640736

Chinese Library Classification (CLC):

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Speech enhancement is the processing of noisy speech signals to improve both human perception and machine understanding of the signal. For speech signals with a medium or high signal-to-noise ratio (SNR), the aim is to produce a subjectively pragmatic signal; for signals with a low SNR, the aim is to reduce the noise while still maintaining intelligibility. Many noise reduction algorithms improve overall speech quality, but little progress has been made in improving overall speech intelligibility. This paper proposes a deep convolutional neural network (DCNN) speech enhancement method that improves loss functions such as extended short-time objective intelligibility (ESTOI) and mean square error (MSE). These loss functions are improved using Harris Hawks Optimization (HHO). The enhanced speech signal is obtained by separating the clean speech signal from the noisy speech signal. The efficacy of speech enhancement is evaluated using several predictive measures of objective speech intelligibility: short-time objective intelligibility (STOI), source-to-artefact ratio (SAR), coherence speech intelligibility index (CSII), and source-to-distortion ratio (SDR). The quality of the enhanced speech signal is assessed using quality measures such as speech distortion (SD) and perceptual evaluation of speech quality (PESQ).
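To make two of the quantities named in the abstract concrete, the following is a minimal sketch of the MSE between a clean and an enhanced waveform and of a simplified energy-ratio SDR. It is not the authors' implementation: the function names `mse` and `simple_sdr_db`, the synthetic sine-plus-noise signals, and the "pretend" enhancer output are all assumptions made for illustration. The SDR here is the plain ratio of signal energy to error energy, not the full BSS Eval decomposition that also yields SAR. ESTOI, CSII, PESQ, the DCNN, and HHO are not implemented.

```python
import numpy as np


def mse(clean, estimate):
    """Mean square error between the clean and the estimated waveform."""
    clean = np.asarray(clean, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    return float(np.mean((clean - estimate) ** 2))


def simple_sdr_db(clean, estimate, eps=1e-12):
    """Simplified SDR in dB: 10*log10(||s||^2 / ||s - s_hat||^2).

    Assumption: a plain energy ratio, not the BSS Eval definition.
    eps guards against division by zero and log of zero.
    """
    clean = np.asarray(clean, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    signal_energy = np.sum(clean ** 2)
    error_energy = np.sum((clean - estimate) ** 2)
    return float(10.0 * np.log10((signal_energy + eps) / (error_energy + eps)))


# Synthetic stand-ins (hypothetical data, not from the paper):
# one second of a 220 Hz tone sampled at 16 kHz.
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
clean = np.sin(2 * np.pi * 220.0 * t)
noisy = clean + 0.5 * rng.standard_normal(t.size)
# Pretend enhancer output: the clean tone with less residual noise.
enhanced = clean + 0.1 * rng.standard_normal(t.size)

print("MSE  noisy vs. enhanced:", mse(clean, noisy), mse(clean, enhanced))
print("SDR  noisy vs. enhanced (dB):",
      simple_sdr_db(clean, noisy), simple_sdr_db(clean, enhanced))
```

A lower MSE and a higher SDR for the enhanced signal than for the noisy input would indicate that an enhancer has moved the waveform closer to the clean reference. As the abstract notes, such waveform-level gains do not by themselves guarantee better intelligibility, which is why intelligibility-oriented measures are reported separately.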


Pages: 770-775

Page count: 6

