Glottal inverse filtering analysis of human voice production - A review of estimation and parameterization methods of the glottal excitation and their applications

Cited by: 144
Author
Alku, Paavo [1]
Affiliation
[1] Aalto Univ, Dept Signal Proc & Acoust, Aalto 00076, Finland
Source
SADHANA-ACADEMY PROCEEDINGS IN ENGINEERING SCIENCES | 2011 / Vol. 36 / Issue 05
Funding
Academy of Finland;
Keywords
Speech; voice; speech production; inverse filtering; glottis; FLOW WAVE-FORM; AIR-FLOW; VOCAL-INTENSITY; FUNDAMENTAL-FREQUENCY; SUBGLOTTAL PRESSURE; SPEECH SYNTHESIS; AMPLITUDE QUOTIENT; EPOCH EXTRACTION; FEMALE SPEAKERS; SOUND PRESSURE;
DOI
10.1007/s12046-011-0041-5
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08;
Abstract
Glottal inverse filtering (GIF) refers to methods of estimating the source of voiced speech, the glottal volume velocity waveform. GIF is based on the idea of inversion, in which the effects of the vocal tract and lip radiation are cancelled from the output of the voice production mechanism, the speech signal. This article provides a review on GIF research by examining an era spanning five decades during which this topic has been under development. The topic is handled from three main perspectives: the estimation methods of the glottal source, the parameterization techniques that have been developed to express the estimated glottal excitations in numerical forms, and the application areas of GIF. Finally, the strengths and limitations of the GIF approach are discussed.
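The inversion idea described in the abstract can be illustrated with a toy source-filter model. The sketch below is only a minimal illustration under strong assumptions and is not a method taken from the article. The pulse shape, the single 500 Hz resonance, the lip radiation coefficient 0.99, and all function names are hypothetical. It also assumes the vocal tract filter is known exactly, whereas practical GIF methods must estimate it from the speech signal.

```python
import math

def fir_filter(x, b):
    """Apply y[n] = sum_k b[k] * x[n-k] with zero initial state."""
    return [sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
            for n in range(len(x))]

def all_pole_filter(x, a):
    """Apply y[n] = x[n] - sum_{k>=1} a[k] * y[n-k], assuming a[0] == 1."""
    y = []
    for n, xn in enumerate(x):
        y.append(xn - sum(a[k] * y[n - k] for k in range(1, len(a)) if n - k >= 0))
    return y

def toy_glottal_flow(n_periods=4, period=80, open_len=50):
    """Hypothetical pulse train: sin^2 open phase, then a closed phase of zeros."""
    flow = []
    for _ in range(n_periods):
        for n in range(period):
            flow.append(math.sin(math.pi * n / open_len) ** 2 if n < open_len else 0.0)
    return flow

# Assumed toy parameters (not from the article).
fs = 8000.0                      # sampling rate in Hz
theta = 2.0 * math.pi * 500.0 / fs
r = 0.9                          # pole radius < 1 keeps the filter stable
vocal_tract_a = [1.0, -2.0 * r * math.cos(theta), r * r]  # A(z), one resonance
lip_b = [1.0, -0.99]             # L(z) = 1 - 0.99 z^-1, a leaky differentiator

# Forward model: glottal flow -> vocal tract 1/A(z) -> lip radiation L(z).
glottal_flow = toy_glottal_flow()
speech = fir_filter(all_pole_filter(glottal_flow, vocal_tract_a), lip_b)

# Inverse filtering: cancel the vocal tract with A(z), then the lip
# radiation with 1/L(z), which is a leaky integrator.
tract_removed = fir_filter(speech, vocal_tract_a)
estimate = all_pole_filter(tract_removed, lip_b)

max_error = max(abs(g - e) for g, e in zip(glottal_flow, estimate))
print("max reconstruction error:", max_error)
```

Because the inverse filters here are the exact reciprocals of the forward filters, the estimate matches the toy excitation up to rounding error. With real speech the recovered waveform is only as good as the vocal tract estimate.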
Pages: 623-650
Page count: 28