Speech endpoint identification based on empirical mode decomposition

Cited by: 1
Authors
Yao, Zhen-Jie [1]
Huang, Hai [1]
Chen, Xiang-Xian [1]
Affiliations
[1] Department of Instrumentation Science and Engineering, Zhejiang University
Source
Zhejiang Daxue Xuebao (Gongxue Ban)/Journal of Zhejiang University (Engineering Science) | 2009, Vol. 43, No. 04
Keywords
Empirical mode decomposition (EMD); Instantaneous frequency; Speech endpoint identification
DOI
10.3785/j.issn.1008-973X.2009.04.019
Abstract
A new method based on empirical mode decomposition (EMD) is proposed to identify the endpoints of speech segments in noise-contaminated speech signals. The noisy speech signal is decomposed into a set of intrinsic mode functions (IMFs) using EMD, and the average instantaneous frequency of each IMF is estimated from its short-time zero-crossing rate. Based on the characteristics of the average instantaneous frequencies of IMFs derived from speech signals, frames with low and slowly changing average instantaneous frequencies are identified as periodic voiced (sonant) segments, and frames with high average instantaneous frequencies are identified as unvoiced (surd) segments. The final speech signals are obtained by processing and combining these segments. Numerical and experimental results show that the method can effectively identify endpoints in speech that is heavily contaminated by noise.
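The frequency-estimation step in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the IMF has already been computed by some EMD routine (not shown), it uses the textbook relation that a sinusoid crosses zero twice per period, and it ignores the "slowly changing" criterion for simplicity. The function names, the frame length, and the thresholds `low_hz` and `high_hz` are hypothetical and chosen only for illustration.

```python
import math

def zero_crossing_frequency(frame, fs):
    """Estimate the average frequency (Hz) of one frame from its zero-crossing count."""
    if len(frame) < 2:
        return 0.0
    # Count sign changes between consecutive samples.
    crossings = sum(
        1 for prev, cur in zip(frame, frame[1:]) if (prev < 0) != (cur < 0)
    )
    # Two zero crossings per period, frame duration = len(frame) / fs seconds.
    return crossings * fs / (2.0 * len(frame))

def frame_frequencies(imf, fs, frame_len, hop):
    """Split one IMF into frames and estimate the average frequency of each frame."""
    return [
        zero_crossing_frequency(imf[start:start + frame_len], fs)
        for start in range(0, len(imf) - frame_len + 1, hop)
    ]

def label_frames(freqs, low_hz, high_hz):
    """Label frames by frequency alone (hypothetical thresholds, simplified rule)."""
    labels = []
    for f in freqs:
        if f >= high_hz:
            labels.append("unvoiced")   # high average frequency
        elif 0 < f <= low_hz:
            labels.append("voiced")     # low average frequency
        else:
            labels.append("other")      # neither criterion met
    return labels

if __name__ == "__main__":
    fs = 8000  # sampling rate in Hz (assumed)
    # Synthetic stand-in for a low-frequency IMF: a 100 Hz sinusoid.
    imf = [math.sin(2 * math.pi * 100 * n / fs + 0.3) for n in range(1600)]
    freqs = frame_frequencies(imf, fs, frame_len=800, hop=800)
    print(freqs, label_frames(freqs, low_hz=500, high_hz=2000))
```

In a fuller pipeline the labels from several IMFs would be combined and the frame-to-frame variation of the frequency estimate would also be checked, as the abstract describes; how the paper does this is not stated in this record.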
Pages: 705-709
Page count: 4