Enhancement of speech using deep neural network with discrete cosine transform

Cited by: 2
Authors
Ram, Rashmirekha [1]
Mohanty, Mihir Narayan [1]
Affiliations
[1] Siksha O Anusandhan Univ Deemed Be Univ, Dept Elect & Communication Engn, Bhubaneswar, India
Keywords
Discrete cosine transform; deep neural network; speech enhancement; perceptual evaluation of speech quality; segmental signal-to-noise; NOISE; INTELLIGIBILITY;
DOI
10.3233/JIFS-169575
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this digitized world, users demand quality and accuracy. In practice, nearly all signals are analog in nature and are contaminated with noise. This paper considers the speech signal, which varies from person to person and from time to time. Applications in engineering, medicine, and social settings require the signal to be enhanced, and an enhanced version is obtained by reducing both the noise and the redundant data in the signal. Because speech is nonstationary, it is first preprocessed and normalized. The spectral domain is the most suitable for analyzing the speech signal and is used here, with the Discrete Cosine Transform (DCT-II) as the transform. Because DCT-II has advantages over other transforms and is simpler to compute, its coefficients are passed to a Deep Neural Network (DNN) model that reduces the noise and enhances the signal, so that signals from any environment and of any amount can be enhanced using this model. A total of 100 sentences were collected from 5 male and 5 female speakers, each of whom uttered 10 sentences. Although many researchers have applied DCT-II and DNNs to signal feature extraction and image classification, they are used here for speech enhancement, which is the novelty of this work. The results are better than those of previously applied methods, and the approach can be used in any real-time application. The results section presents a visual inspection together with comparison values, and the evaluation measures show the method's efficacy.
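The front end described in the abstract (normalize the speech, then take DCT-II coefficients as the network input) can be illustrated with a minimal sketch. This is not the authors' implementation: the sampling rate, the frame length of 256 samples, the hop of 128 samples, the peak normalization, the synthetic test signal, and the helper names `dct2_matrix` and `frame_signal` are all assumptions made for illustration. The DNN itself is omitted because the record gives no architecture details.

```python
import numpy as np

def dct2_matrix(n):
    """Orthonormal DCT-II basis as an (n, n) matrix; rows are basis vectors."""
    k = np.arange(n)[:, None]          # frequency index
    m = np.arange(n)[None, :]          # sample index
    basis = np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    basis *= np.sqrt(2.0 / n)
    basis[0, :] /= np.sqrt(2.0)        # scale the DC row so the matrix is orthogonal
    return basis

def frame_signal(x, frame_len, hop):
    """Slice a 1-D signal into overlapping frames of shape (n_frames, frame_len)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

# Synthetic stand-in for noisy speech (assumption: 1 s at 8 kHz, tone plus white noise).
rng = np.random.default_rng(0)
fs = 8000
t = np.arange(fs) / fs
noisy = np.sin(2 * np.pi * 220 * t) + 0.3 * rng.standard_normal(fs)

# Peak normalization (assumption: the record does not say which normalization is used).
noisy = noisy / np.max(np.abs(noisy))

frames = frame_signal(noisy, frame_len=256, hop=128)
D = dct2_matrix(256)
coeffs = frames @ D.T                  # DCT-II coefficients, one row per frame
recon = coeffs @ D                     # inverse transform (D is orthogonal)
```

Because the orthonormal DCT-II matrix is orthogonal, its transpose inverts it, so a model that outputs enhanced coefficients could be mapped back to the time domain with the same matrix.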
Pages: 141-148
Number of pages: 8