A Novel Method of Glottal Inverse Filtering

Cited by: 9

Authors:

Sahoo, Subhasmita ^{[1]}

Routray, Aurobinda ^{[1]}

Affiliations:

[1] Indian Institute of Technology Kharagpur, Department of Electrical Engineering, Kharagpur 721302, West Bengal, India

Source:

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2016 / Vol. 24 / No. 07

Keywords:

Concatenated tube model; extended Kalman filter; glottal inverse filtering; LF model; multiple model estimation; RECOGNITION;

DOI:

10.1109/TASLP.2016.2551864

Chinese Library Classification number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

This paper presents a new technique for glottal inverse filtering using a distributed model of the vocal tract. A discrete state space model has been constructed for the speech production system by combining the concatenated tube model of the vocal tract and the Liljencrants-Fant (LF) model of the glottal flow derivative waveform. An adaptive system identification technique, based on extended Kalman filtering, has been used to estimate the states and model parameters from continuous speech. The glottal signal, represented by the LF model, is piecewise differentiable within one glottal cycle. Hence, the hybrid system has been characterized by separate models for two different modes. Multiple model estimation has been performed by switching between the two models at the mode jumps. The open phase of the glottal cycle has been considered as Mode 1, whereas the return phase and closed phase combined have been taken as Mode 2. The starting point of Mode 1, also known as the glottal opening instant, was estimated by observing formant modulation, which remains negligible during the closed phase and starts to increase at the onset of opening. The starting point of Mode 2, also known as the glottal closing instant, was computed by peak-picking from the linear prediction (LP) residual signal. The proposed method estimates the glottal waveform as well as changes in flow occurring at different sections of the vocal tract during speech production. This technique has been found to be accurate and robust to variations in pitch as compared to other LP-based methods in the literature. The method also estimates the air pressure distribution at different sections of the vocal tract.
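The abstract states that glottal closing instants are found by peak-picking the LP residual. The following is a minimal sketch of that general idea on a synthetic signal, not the authors' implementation. The function names (`synth_voiced`, `lpc`, `lp_residual`, `pick_peaks`), the two-pole resonator standing in for the vocal tract, the impulse-train excitation, and the threshold and spacing values are all illustrative assumptions.

```python
import math

def synth_voiced(n_samples, period, first, r, theta):
    """Toy voiced signal (assumption): an impulse train driving a two-pole resonator."""
    b1, b2 = 2.0 * r * math.cos(theta), -r * r
    y = []
    for n in range(n_samples):
        x = 1.0 if n >= first and (n - first) % period == 0 else 0.0
        y1 = y[n - 1] if n >= 1 else 0.0
        y2 = y[n - 2] if n >= 2 else 0.0
        y.append(x + b1 * y1 + b2 * y2)
    return y

def lpc(signal, order):
    """LP coefficients by the autocorrelation method (Levinson-Durbin).
    Returns a with a[0] == 1, so the residual is e[n] = sum_k a[k] * s[n-k]."""
    n = len(signal)
    r = [sum(signal[i] * signal[i + lag] for i in range(n - lag))
         for lag in range(order + 1)]
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k                  # updated prediction error energy
    return a

def lp_residual(signal, a):
    """Inverse-filter the signal with the LP polynomial to get the residual."""
    order = len(a) - 1
    return [sum(a[k] * signal[n - k] for k in range(order + 1) if n - k >= 0)
            for n in range(len(signal))]

def pick_peaks(residual, min_distance, rel_threshold=0.5):
    """Keep the strongest residual magnitudes that lie at least min_distance apart."""
    mag = [abs(x) for x in residual]
    thr = rel_threshold * max(mag)
    candidates = sorted((i for i, m in enumerate(mag) if m >= thr),
                        key=lambda i: -mag[i])
    picked = []
    for i in candidates:
        if all(abs(i - p) >= min_distance for p in picked):
            picked.append(i)
    return sorted(picked)

# Synthetic example: excitation impulses every 80 samples, starting at sample 40.
speech = synth_voiced(n_samples=800, period=80, first=40, r=0.9, theta=math.pi / 4)
a = lpc(speech, order=2)
residual = lp_residual(speech, a)
gci = pick_peaks(residual, min_distance=40)
print(gci)
```

On this synthetic input the picked indices should coincide with the excitation impulse positions. Real speech would need windowed, frame-wise LP analysis and a more careful peak criterion, which this sketch deliberately leaves out.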


Pages: 1230-1241

Number of pages: 12

References: 34 in total
