Decomposition of speech into voiced and unvoiced components based on a state-space signal model

Citations: 0
Authors
Thomson, M [1]
Boland, S [1]
Wu, M [1]
Epps, J [1]
Smithers, M [1]
Affiliation
[1] Motorola Labs, Botany, NSW, Australia
Source
2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I | 2003
Keywords
DOI
Not available
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
We present a novel method for decomposing speech into voiced and unvoiced components. After demodulating variations in spectral envelope, energy and pitch, the method involves applying a bank of Kalman filters to separate the harmonic and non-harmonic components of the signal. This approach relies on a state-space representation of the composite signal, and provides a way to accurately estimate the harmonic component without the large delay required by a linear phase comb filter. However, it also requires prior knowledge of the variance of the unvoiced component and the state transition parameters. We present a novel method to accurately determine these parameters based on a variant of the Expectation-Maximization algorithm. Modifications for dealing with unvoiced segments and voicing onset are also described.
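As a rough illustration of the idea in the abstract, the sketch below separates a periodic component from additive noise with one scalar Kalman filter per pitch phase. Everything in it is an assumption made for illustration and is not taken from the paper: the function name `decompose_periodic`, the random-walk state model, a known and constant period (standing in for the signal after pitch demodulation), and hand-picked variances `q` and `r` in place of the EM-based estimates the abstract mentions.

```python
import numpy as np

def decompose_periodic(y, period, q, r):
    """Split y into a periodic (harmonic) estimate and a residual.

    Assumed model (illustrative only): for each phase k = 0..period-1,
    the samples y[k], y[k + period], ... are noisy observations of a
    scalar random walk:
        h[n + 1] = h[n] + w,  w ~ N(0, q)   (slow cycle-to-cycle drift)
        y[n]     = h[n] + v,  v ~ N(0, r)   (non-harmonic component)
    """
    y = np.asarray(y, dtype=float)
    harmonic = np.zeros_like(y)
    for k in range(period):
        obs = y[k::period]            # all samples at this pitch phase
        if obs.size == 0:             # signal shorter than one period
            continue
        x, p = obs[0], r              # initialise from the first sample
        est = np.empty_like(obs)
        est[0] = x
        for n in range(1, obs.size):
            p = p + q                     # predict: variance grows by q
            gain = p / (p + r)            # Kalman gain
            x = x + gain * (obs[n] - x)   # update with the new sample
            p = (1.0 - gain) * p
            est[n] = x
        harmonic[k::period] = est
    return harmonic, y - harmonic
```

The filter is causal: each output sample depends only on current and past input, which is the property the abstract contrasts with the delay of a linear phase comb filter. A small `q` relative to `r` makes each per-phase filter behave like a long running average over pitch cycles, while a larger `q` lets the harmonic estimate track faster changes at the cost of passing more noise.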
Pages: 160-163
Number of pages: 4