SHO based Deep Residual network and hierarchical speech features for speech enhancement

Cited by: 0
Authors
Bhosle M.R. [1, 2]
Narayaswamy N.K. [2, 3]
Affiliations
[1] Electronics and Communication Engineering, Government Engineering College, Raichur
[2] Visvesvaraya Technological University, Karnataka, Belagavi
[3] Department of ECE, Nagarjuna College of Engineering and Technology, Bangalore
Keywords
Bark Frequency Cepstral Coefficients; Deep residual network; Harmony search optimization algorithm; Shuffled Shepherd Optimization Algorithm; Speech enhancement
DOI
10.1007/s10772-022-09972-x
Abstract
Listeners often have difficulty understanding speech in the presence of real-world noise, and external noise degrades listening comfort, so speech enhancement is needed. This paper develops a Shepherd Harmony Optimization (SHO)-based Deep Residual Network (DRN) for speech enhancement. Here, SHO combines the Shuffled Shepherd Optimization Algorithm (SSOA) and Harmony Search (HS) optimization. The Hanning window is used to pre-process the input data. Bark Frequency Cepstral Coefficients (BFCC) and the Fractional Delta amplitude modulation spectrogram (FD-AMS) are used for feature extraction. The noise present in the speech signal is then predicted so that distorting noise and external disturbances can be removed. The DRN classifier, trained with the newly devised optimization algorithm, is used to enhance the speech signal. The developed technique achieved a Perceptual Evaluation of Speech Quality (PESQ) score of 2.646 and a Root Mean Square Error (RMSE) of 0.0067. © 2022, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
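As an illustration of two steps named in the abstract, the following is a minimal sketch of Hann (Hanning) window framing and of the RMSE metric. It is not the authors' implementation: the function names, the frame length, and the hop size are assumptions chosen for the example, and the abstract does not state which values the paper uses.

```python
import math


def hann_window(n):
    """Return a symmetric Hann (Hanning) window of length n."""
    if n < 1:
        raise ValueError("window length must be positive")
    if n == 1:
        return [1.0]
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def frame_signal(signal, frame_len, hop):
    """Split a signal into overlapping frames and window each frame.

    frame_len and hop are hypothetical parameters; trailing samples that
    do not fill a whole frame are dropped.
    """
    window = hann_window(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frames.append([signal[start + i] * window[i] for i in range(frame_len)])
    return frames


def rmse(clean, enhanced):
    """Root mean square error between two equal-length signals."""
    if len(clean) != len(enhanced) or not clean:
        raise ValueError("signals must be non-empty and of equal length")
    squared = sum((c - e) ** 2 for c, e in zip(clean, enhanced))
    return math.sqrt(squared / len(clean))
```

A lower RMSE between the clean reference and the enhanced output indicates a closer match; the windowed frames would typically feed a feature extractor such as BFCC, which is not sketched here.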
Pages: 355-370
Number of pages: 15
Related papers
50 items in total
  • [31] Deep Autoencoder based Speech Features for Improved Dysarthric Speech Recognition
    Vachhani, Bhavik
    Bhat, Chitralekha
    Das, Biswajit
    Kopparapu, Sunil Kumar
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1854 - 1858
  • [32] The Application of Deep Neural Network in Speech Enhancement Processing
    Chen Jian-ming
    Liang Zhi-cheng
    2018 5TH INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND CONTROL ENGINEERING (ICISCE 2018), 2018, : 1263 - 1266
  • [33] Speech Enhancement Using NMF based on Hierarchical Deep Neural Networks with Joint Learning
    Mirjalili, Mohammad Mahdi
    Seyedin, Sanaz
    2020 28TH IRANIAN CONFERENCE ON ELECTRICAL ENGINEERING (ICEE), 2020, : 1411 - 1415
  • [34] Speech Enhancement via Residual Dense Generative Adversarial Network
    Zhou, Lin
    Zhong, Qiuyue
    Wang, Tianyi
    Lu, Siyuan
    Hu, Hongmei
    COMPUTER SYSTEMS SCIENCE AND ENGINEERING, 2021, 38 (03): : 279 - 289
  • [35] Speech Enhancement Method Based On LSTM Neural Network for Speech Recognition
    Liu, Ming
    Wang, Yujun
    Wang, Jin
    Wang, Jing
    Xie, Xiang
    PROCEEDINGS OF 2018 14TH IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP), 2018, : 245 - 249
  • [36] Deep Neural Network Based Monaural Speech Enhancement with Low-Rank Analysis and Speech Present Probability
    Shi, Wenhua
    Zhang, Xiongwei
    Zou, Xia
    Sun, Meng
    Han, Wei
    Li, Li
    Min, Gang
    IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2018, E101A (03) : 585 - 589
  • [37] Speech Enhancement Based on Deep Denoising Autoencoder
    Lu, Xugang
    Tsao, Yu
    Matsuda, Shigeki
    Hori, Chiori
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 436 - 440
  • [38] Speech enhancement based on perceptually comfortable residual noise
    Shin, Jong Won
    Chang, Joon-Hyuk
    Kim, Nam Soo
    IEICE TRANSACTIONS ON COMMUNICATIONS, 2007, E90B (11) : 3323 - 3326
  • [39] An improved wavelet-based speech enhancement by using speech signal features
    Ayat, Saeed
    Manzuri-Shalmani, M. T.
    Dianat, Roohollah
    COMPUTERS & ELECTRICAL ENGINEERING, 2006, 32 (06) : 411 - 425
  • [40] Broad Phoneme Class Specific Deep Neural Network Based Speech Enhancement
    Karjol, Pavan
    Ghosh, Prasanta Kumar
    2018 INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS (SPCOM 2018), 2018, : 372 - 376