A two-level drive-response model of non-stationary speech signals

Cited by: 0
Authors
Drepper, FR [1]
Affiliations
[1] Forschungszentrum Julich GmbH, Zent Inst Elektron, D-52425 Julich, Germany
Source
NONLINEAR ANALYSES AND ALGORITHMS FOR SPEECH PROCESSING | 2005 / Vol. 3817
Keywords
DOI
Not available
CLC classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The transmission protocol of voiced speech is hypothesized to be based on a fundamental drive process, which synchronizes the vocal tract excitation on the transmitter side and evokes the pitch perception on the receiver side. A band limited fundamental drive is extracted from a voice specific subband decomposition of the speech signal. When the near periodic drive is used as fundamental drive of a two-level drive-response model, a more or less aperiodic voiced excitation can be reconstructed as a more or less aperiodic trajectory on a low dimensional continuous synchronization manifold (surface) described by speaker and phoneme specific coupling functions. In the case of vowels and nasals the excitation can be described by a univariate coupling function, which depends on the momentary phase of the fundamental drive. In the case of other voiced consonants the coupling function may as well depend on a delayed fundamental phase with a phoneme specific time delay. The delay may exceed the length of the analysis window. The resulting long range correlation cannot be analysed or synthesized by models assuming stationary excitation.
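The vowel and nasal case in the abstract, where the excitation depends only on the momentary phase of the fundamental drive, can be illustrated with a toy sketch. This is not the paper's algorithm: the drive phase is synthesized rather than extracted from a subband decomposition, and the coupling function is estimated by simple phase-bin averaging, which is a stand-in technique chosen for brevity. The names `synth_phase`, `true_coupling`, `fit_coupling`, and `reconstruct`, as well as the sampling rate, fundamental frequency, and drift values, are all hypothetical.

```python
import math

TWO_PI = 2.0 * math.pi

def synth_phase(n, fs=8000.0, f0=120.0, drift=0.1):
    """Phase of a near-periodic drive whose frequency drifts slowly (toy assumption)."""
    phase, out = 0.0, []
    for i in range(n):
        # Instantaneous frequency wobbles around f0, so the drive is not strictly periodic.
        f = f0 * (1.0 + drift * math.sin(TWO_PI * 1.5 * i / fs))
        phase += TWO_PI * f / fs
        out.append(phase % TWO_PI)
    return out

def true_coupling(phi):
    """Hypothetical coupling function, used only to generate test data."""
    return math.sin(phi) + 0.5 * math.sin(2 * phi + 0.3) + 0.25 * math.cos(3 * phi)

def bin_index(phi, n_bins):
    """Map a phase in [0, 2*pi) to a bin index in [0, n_bins)."""
    return min(int(phi / TWO_PI * n_bins), n_bins - 1)

def fit_coupling(phases, response, n_bins=64):
    """Estimate a univariate coupling function as the mean response per phase bin."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for phi, y in zip(phases, response):
        b = bin_index(phi, n_bins)
        sums[b] += y
        counts[b] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]

def reconstruct(phases, table):
    """Rebuild the response by looking up the fitted coupling function at each phase."""
    return [table[bin_index(phi, len(table))] for phi in phases]

phases = synth_phase(8000)                        # one second of drive phase
response = [true_coupling(p) for p in phases]     # aperiodic in time, exact in phase
table = fit_coupling(phases, response)
estimate = reconstruct(phases, table)
rms_error = math.sqrt(
    sum((a - b) ** 2 for a, b in zip(estimate, response)) / len(response)
)
```

In this sketch the response is aperiodic as a function of time because the drive frequency drifts, yet it is recovered to within the bin resolution from the phase alone. The delayed-phase case mentioned for other voiced consonants would need a second argument to the coupling function and is not covered here.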
Pages: 125-138
Page count: 14