The fractal properties of vocal sounds and their application in the speech recognition model

Cited by: 27
Authors
Sabanal, S
Nakagawa, M
Affiliation
[1] Department of Electrical Engineering, Faculty of Engineering, Nagaoka University of Technology, Nagaoka, Niigata 940-21
DOI
10.1016/S0960-0779(96)00043-4
Chinese Library Classification number
O1 [Mathematics]
Subject classification codes
0701; 070101
Abstract
In this work, we shall examine the fractal properties of simple vocal sounds such as Japanese vowels by evaluating the fractal dimension related to the self-affine property. We shall also examine the existence of chaos in the attractors reconstructed from the vocal sound waveforms by evaluating the Lyapunov exponents. The reconstructed attractors are also examined for multifractal properties. To characterize the fractal properties of complicated vocal sounds, such as speech utterances composed of several vowels, phonemes, etc., we shall propose the time-dependent fractal dimensions (TDFDs), where the fractal dimensions are evaluated based on the self-affine property, and the time-dependent multifractal dimensions (TDMFDs). We shall then use these fractal properties in a speech recognition model to examine whether our method is able to characterize complicated vocal sounds effectively. For comparison, we shall utilize the running power spectrum (RPS) as a recognition parameter. It was found that utilizing the fractal properties of vocal sounds as recognition parameters gives a high recognition rate, showing that complicated vocal sounds can be effectively characterized by their fractal properties. Copyright (C) 1996 Elsevier Science Ltd.
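The idea of a time-dependent fractal dimension can be illustrated with a minimal sketch. The abstract does not state which estimator the authors use, so everything below is an assumption made for illustration: the dimension is estimated from the scaling of mean squared increments, using the standard relation D = 2 - H for a self-affine curve with Hurst exponent H, and it is then re-evaluated in sliding windows. The function names and the `window`, `hop`, and `max_lag` parameters are hypothetical.

```python
import numpy as np


def self_affine_dimension(x, max_lag=8):
    """Estimate D = 2 - H from the scaling of mean squared increments.

    Assumed estimator (not taken from the paper). The input must not be
    constant, otherwise the logarithm below is undefined.
    """
    x = np.asarray(x, dtype=float)
    lags = np.arange(1, max_lag + 1)
    # For a self-affine signal, <|x(t + lag) - x(t)|^2> scales as lag^(2H).
    msd = np.array([np.mean((x[lag:] - x[:-lag]) ** 2) for lag in lags])
    # The slope of the log-log fit estimates 2H.
    slope, _ = np.polyfit(np.log(lags), np.log(msd), 1)
    hurst = slope / 2.0
    return 2.0 - hurst


def time_dependent_dimension(x, window=512, hop=256, max_lag=8):
    """Evaluate the dimension estimate in overlapping windows along the signal."""
    x = np.asarray(x, dtype=float)
    starts = range(0, len(x) - window + 1, hop)
    return np.array(
        [self_affine_dimension(x[s:s + window], max_lag) for s in starts]
    )
```

Applied to a sampled waveform, `time_dependent_dimension` returns one estimate per window, so a sequence of such values could serve as a feature trajectory for an utterance. On synthetic data, a random walk should give values near 1.5 and white noise values near 2.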
Pages: 1825-1843
Number of pages: 19