SHO based Deep Residual network and hierarchical speech features for speech enhancement

Cited by: 0

Authors:

Bhosle M.R. [1,2]

Narayaswamy N.K. [2,3]

Affiliations:

[1] Electronics and Communication Engineering, Government Engineering College, Raichur

[2] Visvesvaraya Technological University, Karnataka, Belagavi

[3] Department of ECE, Nagarjuna College of Engineering and Technology, Bangalore

Source:

International Journal of Speech Technology | 2023 / Vol. 26 / Issue 02

Keywords:

Bark Frequency Cepstral Coefficients; Deep residual network; Harmony search optimization algorithm; Shuffled Shepherd Optimization Algorithm; Speech enhancement;

DOI:

10.1007/s10772-022-09972-x

Abstract:

Real-world noise often makes speech difficult to understand, and external noise degrades listening comfort, so speech enhancement is needed. This paper develops a Shepherd Harmony Optimization (SHO)-based Deep Residual Network (DRN) for speech enhancement. SHO combines the Shuffled Shepherd Optimization Algorithm (SSOA) with Harmony Search (HS) optimization. The input data are pre-processed with a Hanning window, and features are extracted as Bark Frequency Cepstral Coefficients (BFCC) and the Fractional Delta amplitude modulation spectrogram (FD-AMS). The noise present in the speech signal is then predicted so that distortion and external interference can be removed. The DRN classifier, trained with the newly devised optimization algorithm, is used to enhance the speech signal. The developed technique achieves a Perceptual Evaluation of Speech Quality (PESQ) score of 2.646 and a Root Mean Square Error (RMSE) of 0.0067. © 2022, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
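
The abstract names two steps that are easy to illustrate in isolation: Hanning-window pre-processing and the RMSE metric. The following is a minimal sketch of both, not the authors' implementation. The function names (`hann_window`, `frame_signal`, `rmse`), the frame length, and the hop size are assumptions chosen for illustration; the paper's actual framing parameters are not given in the abstract.

```python
import math


def hann_window(length):
    """Symmetric Hann (Hanning) window: w[n] = 0.5 - 0.5*cos(2*pi*n/(length-1))."""
    if length < 1:
        return []
    if length == 1:
        return [1.0]
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / (length - 1))
            for n in range(length)]


def frame_signal(signal, frame_len, hop):
    """Split a signal into overlapping frames and taper each with a Hann window.

    frame_len and hop are illustrative parameters (assumed, not from the paper).
    Trailing samples that do not fill a whole frame are dropped.
    """
    window = hann_window(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        frames.append([s * w for s, w in zip(frame, window)])
    return frames


def rmse(reference, estimate):
    """Root mean square error between a clean reference and an enhanced estimate."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("signals must be non-empty and of equal length")
    squared = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return math.sqrt(squared / len(reference))


if __name__ == "__main__":
    # Toy sine wave standing in for a speech signal (hypothetical data).
    clean = [math.sin(2.0 * math.pi * 5.0 * t / 64.0) for t in range(64)]
    noisy = [c + 0.01 * ((-1) ** i) for i, c in enumerate(clean)]
    frames = frame_signal(noisy, frame_len=16, hop=8)
    print(len(frames), "frames of length", len(frames[0]))
    print("RMSE between clean and noisy:", rmse(clean, noisy))
```

A lower RMSE means the enhanced signal is closer to the clean reference. PESQ, by contrast, is a standardized perceptual measure and is not reproduced in this sketch.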

Pages: 355-370

Page count: 15

