Wrapped Gaussian Mixture Models for Modeling and High-Rate Quantization of Phase Data of Speech

Cited by: 30
Authors
Agiomyrgiannakis, Yannis [1,2]
Stylianou, Yannis [1,2]
Affiliations
[1] FORTH, Inst Comp Sci, GR-70013 Iraklion, Crete, Greece
[2] Univ Crete, Dept Comp Sci, Multimedia Informat Lab, Iraklion 71409, Crete, Greece
Source
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2009, Vol. 17, No. 4
Keywords
Circular statistics; phase quantization; sinusoidal models; speech analysis; speech coding; voice-over-IP; wrapped Gaussian mixture models (WGMMs)
DOI
10.1109/TASL.2008.2008229
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
The harmonic representation of speech signals has found many applications in speech processing. This paper presents a novel statistical approach to model the behavior of harmonic phases. Phase information is decomposed into three parts: a minimum phase part, a translation term, and a residual term referred to as dispersion phase. Dispersion phases are modeled by wrapped Gaussian mixture models (WGMMs) using an expectation-maximization algorithm suitable for circular vector data. A multivariate WGMM-based phase quantizer is then proposed and constructed using novel scalar quantizers for circular random variables. The proposed phase modeling and quantization scheme is evaluated in the context of a narrowband harmonic representation of speech. Results indicate that it is possible to construct a variable-rate harmonic codec that is equivalent to iLBC at approximately 13 kbps.
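As a rough illustration of the wrapped Gaussian mixture idea named in the abstract, the sketch below evaluates a univariate wrapped Gaussian density by summing shifted copies of an ordinary Gaussian around the circle. It is not the authors' implementation: the paper's model is multivariate and fitted with an EM algorithm, neither of which is shown here. The function names, the `n_wraps` truncation parameter, and the example mixture parameters are all assumptions made for illustration.

```python
import math


def wrapped_gaussian_pdf(theta, mu, sigma, n_wraps=3):
    """Approximate wrapped Gaussian density at angle theta (radians).

    The infinite sum over 2*pi shifts is truncated to 2*n_wraps + 1 terms,
    which is an assumption that works well when sigma is small relative
    to 2*pi*n_wraps.
    """
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    total = 0.0
    for k in range(-n_wraps, n_wraps + 1):
        d = theta - mu + 2.0 * math.pi * k
        total += math.exp(-0.5 * (d / sigma) ** 2)
    return norm * total


def wgmm_pdf(theta, weights, mus, sigmas, n_wraps=3):
    """Density of a univariate wrapped Gaussian mixture (weights sum to 1)."""
    return sum(
        w * wrapped_gaussian_pdf(theta, m, s, n_wraps)
        for w, m, s in zip(weights, mus, sigmas)
    )


if __name__ == "__main__":
    # Hypothetical two-component mixture, used only to exercise the functions.
    print(wgmm_pdf(0.3, [0.4, 0.6], [0.0, 3.0], [0.5, 0.8]))
```

Because the density is defined on the circle, evaluating it at `theta` and at `theta + 2*pi` gives essentially the same value, which is the property that makes this family suitable for phase data.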
Pages: 775-786
Page count: 12