The fractal properties of vocal sounds and their application in the speech recognition model

Cited by: 27
Authors
Sabanal, S
Nakagawa, M
Affiliation
[1] Department of Electrical Engineering, Faculty of Engineering, Nagaoka University of Technology, Nagaoka, Niigata 940-21
DOI
10.1016/S0960-0779(96)00043-4
Chinese Library Classification
O1 [Mathematics];
Subject classification code
0701 ; 070101 ;
Abstract
In this work, we shall examine the fractal properties of simple vocal sounds such as Japanese vowels by evaluating the fractal dimension related to the self-affine property. We shall also examine the existence of chaos in the attractors reconstructed from the vocal sound waveforms by evaluating the Lyapunov exponents. The reconstructed attractors are also examined for multifractal properties. To characterize the fractal properties of complicated vocal sounds, such as speech utterances composed of several vowels, phonemes, etc., we shall propose the time-dependent fractal dimensions (TDFDs), where the fractal dimensions are evaluated based on the self-affine property, and the time-dependent multifractal dimensions (TDMFDs). We shall then use these fractal properties in a speech recognition model to examine if our method is able to characterize complicated vocal sounds effectively. For comparison, we shall utilize the running power spectrum (RPS) as a recognition parameter. It was found that utilizing the fractal properties of vocal sounds as recognition parameters gives a high recognition rate, showing that complicated vocal sounds can be effectively characterized by their fractal properties. Copyright (C) 1996 Elsevier Science Ltd.
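The abstract does not say how the self-affine fractal dimension is estimated, so the following is only an illustrative sketch of what a time-dependent fractal dimension could look like, not the authors' procedure. It assumes a structure-function estimator: for a self-affine trace the mean squared increment scales as lag^(2H), and the dimension of the graph is taken as D = 2 - H. The function names, the lag set, and the window and hop sizes are all hypothetical choices made for this example.

```python
import math


def hurst_exponent(x, lags=(1, 2, 4, 8, 16)):
    """Estimate H from the scaling <(x[i+lag] - x[i])^2> ~ lag^(2H)."""
    log_lag, log_msd = [], []
    for lag in lags:
        n = len(x) - lag
        if n <= 0:
            continue
        msd = sum((x[i + lag] - x[i]) ** 2 for i in range(n)) / n
        if msd > 0:
            log_lag.append(math.log(lag))
            log_msd.append(math.log(msd))
    if len(log_lag) < 2:
        # Assumption: a constant (or too short) frame is treated as smooth.
        return 1.0
    # Least-squares slope of log(msd) against log(lag); the slope equals 2H.
    mx = sum(log_lag) / len(log_lag)
    my = sum(log_msd) / len(log_msd)
    sxx = sum((a - mx) ** 2 for a in log_lag)
    sxy = sum((a - mx) * (b - my) for a, b in zip(log_lag, log_msd))
    return 0.5 * sxy / sxx


def fractal_dimension(x, lags=(1, 2, 4, 8, 16)):
    """Graph dimension D = 2 - H, with H clipped to the valid range [0, 1]."""
    h = min(max(hurst_exponent(x, lags), 0.0), 1.0)
    return 2.0 - h


def time_dependent_fd(signal, window=256, hop=128):
    """Slide a window over the waveform and return one dimension per frame."""
    return [
        fractal_dimension(signal[start:start + window])
        for start in range(0, len(signal) - window + 1, hop)
    ]
```

Applied to a sampled utterance, `time_dependent_fd` returns a sequence of per-frame dimensions that could serve as a feature trajectory, in the same spirit as a running power spectrum computed frame by frame. The estimator itself is a stand-in and may differ from the one used in the paper.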
Pages: 1825-1843
Number of pages: 19
Related papers
50 in total
  • [31] On the Impact of Non-speech Sounds on Speaker Recognition
    Janicki, Artur
    TEXT, SPEECH AND DIALOGUE, TSD 2012, 2012, 7499 : 566 - 572
  • [32] A speech recognition approach with MFCC and fractal dimension
    Yao, Minghai
    Hu, Jing
    DCABES 2006 Proceedings, Vols 1 and 2, 2006, : 349 - 351
  • [33] Development of speech recognition system for remote vocal music teaching based on Markov model
    Xu, Fumei
    Xia, Yu
    SOFT COMPUTING, 2023, 27 (14) : 10237 - 10248
  • [34] A Review of the Application of Fractal in Recognition
    Yuan, Yongqin
    Liu, Fengming
    Zhou, Chuiyun
    PROCEEDINGS OF THE 2017 INTERNATIONAL CONFERENCE ON MATERIAL SCIENCE, ENERGY AND ENVIRONMENTAL ENGINEERING (MSEEE 2017), 2017, 125 : 44 - 49
  • [35] THE APPLICATION OF FRACTAL DIMENSION TO TEMPOROMANDIBULAR-JOINT SOUNDS
    BADWAL, RSS
    COMPUTERS IN BIOLOGY AND MEDICINE, 1993, 23 (01) : 1 - 14
  • [36] Impact of vocal effort variability on automatic speech recognition
    Zelinka, Petr
    Sigmund, Milan
    Schimmel, Jiri
    SPEECH COMMUNICATION, 2012, 54 (06) : 732 - 742
  • [37] Vocal Tract Representation in the Recognition of Cerebral Palsied Speech
    Rudzicz, Frank
    Hirst, Graeme
    van Lieshout, Pascal
    JOURNAL OF SPEECH LANGUAGE AND HEARING RESEARCH, 2012, 55 (04) : 1190 - 1207
  • [38] Vocal interaction: beyond traditional automatic speech recognition
    Kurniawan, Sri
    Sporka, Adam J.
    Harada, Susumu
    UNIVERSAL ACCESS IN THE INFORMATION SOCIETY, 2009, 8 (02) : 63 - 64
  • [40] ESTIMATION OF VOCAL-TRACT SHAPES FROM ACOUSTIC ANALYSIS OF SPEECH SOUNDS
    WAKITA, H
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1977, 62 : S37 - S37