Exploiting nonlinear recurrence and fractal scaling properties for voice disorder detection

Cited by: 494
Authors
Little, Max A.
McSharry, Patrick E.
Roberts, Stephen J.
Costello, Declan A. E.
Moroz, Irene M.
Affiliations
[1] Univ Oxford, Dept Engn Sci, Modelling & Predict Grp, Oxford OX1 3PJ, England
[2] Univ Oxford, Dept Engn Sci, Pattern Anal Res Grp, Oxford OX1 3PJ, England
[3] Univ Oxford, Math Inst, Oxford Ctr Ind & Appl Math, Appl Dynam Syst Res Grp, Oxford OX1 3PJ, England
[4] Milton Keynes Dist Gen Hosp, Milton Keynes MK6 5LD, Bucks, England
DOI
10.1186/1475-925X-6-23
Chinese Library Classification (CLC) number
R318 [Biomedical Engineering]
Discipline classification code
0831
Abstract
Background: Voice disorders affect patients profoundly, and acoustic tools can potentially measure voice function objectively. Disordered sustained vowels exhibit wide-ranging phenomena, from nearly periodic to highly complex, aperiodic vibrations, and increased "breathiness". Modelling and surrogate data studies have shown significant nonlinear and non-Gaussian random properties in these sounds. Nonetheless, existing tools are limited to analysing voices displaying near periodicity, and do not account for this inherent biophysical nonlinearity and non-Gaussian randomness, often using linear signal processing methods insensitive to these properties. They do not directly measure the two main biophysical symptoms of disorder: complex nonlinear aperiodicity, and turbulent, aeroacoustic, non-Gaussian randomness. Often these tools cannot be applied to more severe disordered voices, limiting their clinical usefulness.
Methods: This paper introduces two new tools to speech analysis: recurrence and fractal scaling, which overcome the range limitations of existing tools by addressing directly these two symptoms of disorder, together reproducing a "hoarseness" diagram. A simple bootstrapped classifier then uses these two features to distinguish normal from disordered voices.
Results: On a large database of subjects with a wide variety of voice disorders, these new techniques can distinguish normal from disordered cases, using quadratic discriminant analysis, to overall correct classification performance of 91.8 +/- 2.0%. The true positive classification performance is 95.4 +/- 3.2%, and the true negative performance is 91.5 +/- 2.3% (95% confidence). This is shown to outperform all combinations of the most popular classical tools.
Conclusion: Given the very large number of arbitrary parameters and computational complexity of existing techniques, these new techniques are far simpler and yet achieve clinically useful classification performance using only a basic classification technique. They do so by exploiting the inherent nonlinearity and turbulent randomness in disordered voice signals. They are widely applicable to the whole range of disordered voice phenomena by design. These new measures could therefore be used for a variety of practical clinical purposes.
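The abstract describes classifying voices from two features with quadratic discriminant analysis and bootstrap resampling. The following is a minimal sketch of that general idea, not the authors' procedure: the data are synthetic, the feature values and class means are invented for illustration, the resampling scheme (stratified bootstrap with evaluation on the left-out samples) is an assumption, and all function names are hypothetical. It reports mean and standard deviation across bootstrap rounds rather than the 95% confidence intervals quoted above.

```python
import math
import random
import statistics

def fit_gaussian(points):
    """Return the mean and 2x2 covariance (sxx, sxy, syy) of (x, y) pairs."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def log_discriminant(point, mean, cov, prior):
    """QDA score: log prior plus Gaussian log-density, up to a shared constant."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    # Mahalanobis distance via the closed-form inverse of a 2x2 matrix.
    maha = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.log(prior) - 0.5 * math.log(det) - 0.5 * maha

def train_qda(data):
    """data: list of ((x, y), label); label 0 = normal, 1 = disordered."""
    model = {}
    for label in (0, 1):
        pts = [f for f, l in data if l == label]
        mean, cov = fit_gaussian(pts)
        model[label] = (mean, cov, len(pts) / len(data))
    return model

def predict(model, point):
    """Pick the class with the larger discriminant score."""
    return max(model, key=lambda label: log_discriminant(point, *model[label]))

def bootstrap_qda(data, n_rounds=200, seed=0):
    """Assumed scheme: resample each class with replacement, test on left-out samples."""
    rng = random.Random(seed)
    by_class = {c: [d for d in data if d[1] == c] for c in (0, 1)}
    acc, tpr, tnr = [], [], []
    for _ in range(n_rounds):
        train, held_out = [], []
        for c in (0, 1):
            idx = [rng.randrange(len(by_class[c])) for _ in by_class[c]]
            chosen = set(idx)
            train += [by_class[c][i] for i in idx]
            held_out += [d for i, d in enumerate(by_class[c]) if i not in chosen]
        pos = [f for f, l in held_out if l == 1]
        neg = [f for f, l in held_out if l == 0]
        if not pos or not neg:
            continue  # skip the rare round with no left-out samples in a class
        model = train_qda(train)
        tp = sum(predict(model, f) == 1 for f in pos)
        tn = sum(predict(model, f) == 0 for f in neg)
        tpr.append(tp / len(pos))
        tnr.append(tn / len(neg))
        acc.append((tp + tn) / (len(pos) + len(neg)))
    scores = (("accuracy", acc), ("true_positive", tpr), ("true_negative", tnr))
    return {name: (statistics.mean(v), statistics.stdev(v)) for name, v in scores}

def make_synthetic(n_per_class=60, seed=1):
    """Invented two-feature data; the values do not come from the paper."""
    rng = random.Random(seed)
    normal = [((rng.gauss(0.3, 0.05), rng.gauss(0.5, 0.05)), 0)
              for _ in range(n_per_class)]
    disordered = [((rng.gauss(0.7, 0.10), rng.gauss(0.7, 0.10)), 1)
                  for _ in range(n_per_class)]
    return normal + disordered

if __name__ == "__main__":
    for name, (mean, sd) in bootstrap_qda(make_synthetic()).items():
        print(f"{name}: {100 * mean:.1f} +/- {100 * sd:.1f}%")
```

Fitting a separate covariance per class is what makes the decision boundary quadratic; this matters when, as in the synthetic data here, one class is more widely scattered in feature space than the other.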
Pages: 19