Decomposition of speech into voiced and unvoiced components based on a state-space signal model

Cited by: 0

Authors:

Thomson, M [1]
Boland, S [1]
Wu, M [1]
Epps, J [1]
Smithers, M [1]

Affiliations:

[1] Motorola Labs, Botany, NSW, Australia

Source:

2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I | 2003

Keywords:

DOI:

Not available

Chinese Library Classification number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

We present a novel method for decomposing speech into voiced and unvoiced components. After demodulating variations in spectral envelope, energy and pitch, the method applies a bank of Kalman filters to separate the harmonic and non-harmonic components of the signal. This approach relies on a state-space representation of the composite signal, and provides a way to accurately estimate the harmonic component without the large delay required by a linear phase comb filter. However, it also requires prior knowledge of the variance of the unvoiced component and the state transition parameters. We present a novel method to accurately determine these parameters based on a variant of the Expectation-Maximization algorithm. Modifications for dealing with unvoiced segments and voicing onset are also described.
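The abstract gives no equations, so the following is only an illustrative sketch of the general idea and not the authors' formulation. It assumes that pitch has already been demodulated to a constant period of `period` samples, that each phase of the period follows a random walk with variance `q`, and that the unvoiced component acts as white observation noise with variance `r`. Under those assumptions the state-space model splits into one scalar Kalman filter per phase, which is one possible reading of "a bank of Kalman filters". The function name and all parameters are hypothetical, and `q` and `r` are taken as known here, whereas the paper estimates such parameters with a variant of Expectation-Maximization that is not shown.

```python
import numpy as np

def decompose_periodic(y, period, q, r):
    """Split y into a harmonic estimate and a residual.

    Assumed model (illustrative only): for phase k = t mod period,
        h_k[n+1] = h_k[n] + w,  w ~ N(0, q)   (slowly drifting harmonic part)
        y[t]     = h_k[n] + v,  v ~ N(0, r)   (unvoiced part as white noise)
    One scalar Kalman filter is run per phase k.
    """
    y = np.asarray(y, dtype=float)
    mean = np.zeros(period)       # state estimate for each phase
    var = np.full(period, 1e6)    # large prior variance: uninformative start
    harmonic = np.empty_like(y)
    for t, sample in enumerate(y):
        k = t % period
        var[k] += q                       # predict: random-walk drift
        gain = var[k] / (var[k] + r)      # Kalman gain
        mean[k] += gain * (sample - mean[k])  # update with the new sample
        var[k] *= 1.0 - gain              # posterior variance
        harmonic[t] = mean[k]
    return harmonic, y - harmonic
```

Because each estimate uses only past and current samples, this sketch is causal, which mirrors the abstract's point about avoiding the delay of a linear phase comb filter. The residual `y - harmonic` stands in for the unvoiced component.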


Pages: 160-163

Number of pages: 4

50 items in total

[1] Segregation of voiced and unvoiced components from residual of speech signal
JO Cheol-woo
KIM Jae-hee
Journal of Central South University, 2012, 19 (02) : 496 - 503
[2] Segregation of voiced and unvoiced components from residual of speech signal
Jo, Cheol-woo
Kim, Jae-hee
JOURNAL OF CENTRAL SOUTH UNIVERSITY, 2012, 19 (02) : 496 - 503
[3] Segregation of voiced and unvoiced components from residual of speech signal
Cheol-woo Jo
Jae-hee Kim
Journal of Central South University, 2012, 19 : 496 - 503
[4] IFAS-based voiced/unvoiced classification of speech signal
Arifianto, D
Kobayashi, T
2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I, 2003, : 812 - 815
[5] Speech enhancement based on a voiced-unvoiced speech model
Goh, Z
Tan, KC
Tan, BTG
PROCEEDINGS OF THE 1998 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-6, 1998, : 401 - 404
[6] MODEL BASED BINAURAL ENHANCEMENT OF VOICED AND UNVOICED SPEECH
Kavalekalam, Mathew Shaji
Christensen, Mads Graesboll
Boldt, Jesper B.
2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 666 - 670
[7] The Complexity Analysis of Voiced and Unvoiced Speech Signal Based on Sample Entropy
Sun, Guiqi
Fan, Zhenyan
Mastorakis, Nikos E.
Kaminaris, Stavros D.
Zhuang, Xiaodong
2017 FOURTH INTERNATIONAL CONFERENCE ON MATHEMATICS AND COMPUTERS IN SCIENCES AND IN INDUSTRY (MCSI), 2017, : 26 - 29
[8] Stochastic glottal source applied to voiced-speech decomposition using state-space methods
Alzamendi, Gabriel A.
Schlotthauer, Gaston
Torres, Maria E.
2015 XVI WORKSHOP ON INFORMATION PROCESSING AND CONTROL (RPIC), 2015,
[9] Dual parameters for voiced-unvoiced speech signal determination
Arifianto, Dhany
2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 749 - 752
[10] DIRECTLY MODELING VOICED AND UNVOICED COMPONENTS IN SPEECH WAVEFORMS BY NEURAL NETWORKS
Tokuda, Keiichi
Zen, Heiga
2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 5640 - 5644
