Automatic Speech Recognition System Based on Hybrid Feature Extraction Techniques Using TEO-PWP for in Real Noisy Environment

Cited by: 0
Authors
Helali, Wafa [1]
Hajaiej, Zied [1]
Cherif, Adnen [1]
Affiliation
[1] Univ Tunis El Manar, Res Unite Proc & Anal Elect & Energet Syst, Fac Sci Tunis, Tunis 2092, Tunisia
Source
INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY | 2019, Vol. 19, No. 10
Keywords
Teager-Energy Operator TEO-PWP; Enhancement Speech; MFCC; PLP; RASTA-PLP; HMM; ENHANCEMENT; ENERGY;
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
Automatic speech recognition is an active research area that has long attracted interest ranging from researchers to the general public. It now underpins a wide range of applications of varying nature and difficulty, used by millions of people around the world every day. In this paper, a model of a speech recognition system for noisy environments is developed and analyzed. The proposed model relies on several hybrid feature extraction methods: the Teager-Energy Operator with Perceptual Wavelet Packet (TEO-PWP), Mel-Frequency Cepstral Coefficients (MFCC), and Perceptual Linear Prediction (PLP) are combined to construct a robust HMM-based system. The TIMIT database, which consists of both clean and noisy speech files recorded at different levels of Signal-to-Noise Ratio (SNR), was used to test the system. Results and observations are reported in terms of speech recognition rates to demonstrate the effectiveness of the proposed system.
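The abstract names the Teager-Energy Operator but this record does not give its formula. As a minimal sketch, assuming the standard discrete Teager-Kaiser form psi[n] = x[n]^2 - x[n-1]*x[n+1] (the paper's exact TEO-PWP pipeline is not described here, and the function name `teager_energy` is hypothetical), the operator can be illustrated on a pure tone:

```python
import math


def teager_energy(x):
    """Discrete Teager-Kaiser energy (assumed standard form, not taken from the paper).

    Returns psi[n] = x[n]^2 - x[n-1]*x[n+1] for every interior sample,
    so the output is two samples shorter than the input.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# For a pure tone A*cos(omega*n), the operator is constant and equal to
# (A*sin(omega))^2, so it tracks both amplitude and frequency.
amplitude, omega = 2.0, 0.3
tone = [amplitude * math.cos(omega * n) for n in range(50)]
psi = teager_energy(tone)
expected = (amplitude * math.sin(omega)) ** 2
```

Because the output depends on only three neighbouring samples, the operator reacts quickly to changes in signal energy, which is one reason it is commonly paired with sub-band decompositions in speech enhancement front ends.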
Pages: 118-124
Page count: 7