Acoustic analysis and digital signal processing for the assessment of voice quality

Cited by: 9

Authors:

Jalali-najafabadi, Farideh ^{[1]}

Gadepalli, Chaitanya ^{[2]}

Jarchi, Delaram ^{[3]}

Cheetham, Barry M. G. ^{[4]}

Affiliations:

[1] Univ Manchester, Manchester Acad, Ctr Musculoskeletal Res, Hlth Sci Ctr, Ctr Genet & Genom Versus Arthrit, Fac Biol Med & H, Oxford Rd, Manchester, Lancs, England

[2] Salford Royal NHS Fdn Trust, Manchester, Lancs, England

[3] Univ Essex, Sch Comp Sci & Elect Engn, Colchester, Essex, England

[4] Univ Manchester, Sch Comp Sci, Manchester, Lancs, England

Source:

BIOMEDICAL SIGNAL PROCESSING AND CONTROL | 2021 / Vol. 70

Funding:

UK Medical Research Council;

Keywords:

Praat; MDVP; Speech; Acoustic; HNR; SNR; Shimmer; Jitter; Fundamental frequency (f(o)); TO-NOISE RATIO; JITTER; SHIMMER; SPEECH; PERCEPTION;

DOI:

10.1016/j.bspc.2021.103018

Chinese Library Classification number:

R318 [Biomedical Engineering];

Discipline classification code:

0831;

Abstract:

Purpose: This paper addresses the application of digital signal processing (DSP) techniques to the robust measurement of acoustic features of the human voice. It then addresses the use of regression-based techniques to estimate grade, roughness, breathiness, asthenia and strain from these acoustic features. These five properties of voice are the basis of the widely used 'GRBAS' characterisation of voice disorders. Method: A well-known cross-correlation technique has been enhanced to measure the fundamental frequency of vowels more reliably, which is crucial for deriving acoustic features such as the harmonics-to-noise ratio, jitter and shimmer. Regression techniques, including K-Nearest Neighbour Regression and Multiple Linear Regression, are employed to derive the GRBAS properties. Results: The enhanced cross-correlation technique has been validated against well-established published or commercially available techniques by analysing synthetic sustained vowels. In the context of our experiments, the enhanced method was found to produce more reliable and robust measurements than the well-established Praat technique and the Multi-Dimensional Voice Program (MDVP) software, especially where the signal-to-noise ratio is low. Estimation of GRBAS components using our methods has been found to be in good agreement with traditional GRBAS scoring by speech and language therapists (SLTs). Conclusion: Voice analysis using DSP to extract acoustic features has the potential for objective and computerised GRBAS voice assessment. Such assessment can usefully augment GRBAS assessment as traditionally carried out subjectively by SLTs.
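As a rough illustration of the kind of measurement the abstract describes, the sketch below estimates the fundamental frequency of a synthetic sustained vowel with a plain normalised cross-correlation, and computes a generic cycle-to-cycle perturbation ratio of the sort used for local jitter (on periods) and local shimmer (on amplitudes). It is not the authors' enhanced method, which the abstract does not specify. The function names, the 75 to 500 Hz search range, the 0.9 relative threshold, and the three-harmonic test signal are all assumptions made for this example.

```python
import numpy as np


def synth_vowel(f0, fs, duration):
    """Hypothetical test signal: three harmonics of f0 with decaying amplitude."""
    t = np.arange(int(round(duration * fs))) / fs
    return (np.sin(2 * np.pi * f0 * t)
            + 0.5 * np.sin(2 * np.pi * 2 * f0 * t)
            + 0.25 * np.sin(2 * np.pi * 3 * f0 * t))


def estimate_f0(frame, fs, fmin=75.0, fmax=500.0, rel_threshold=0.9):
    """Estimate f0 (Hz) of one frame by basic normalised cross-correlation.

    Returns (f0, score), where score is the correlation at the chosen lag.
    The search range and threshold are illustrative assumptions.
    """
    x = np.asarray(frame, dtype=float)
    x = x - x.mean()
    lag_min = int(fs // fmax)
    lag_max = int(fs // fmin)
    n = len(x) - lag_max  # number of samples compared at every lag
    if n <= 0:
        raise ValueError("frame too short for the requested fmin")

    ref = x[:n]
    ref_energy = np.dot(ref, ref)
    lags = np.arange(lag_min, lag_max + 1)
    scores = np.zeros(len(lags))
    for i, lag in enumerate(lags):
        seg = x[lag:lag + n]
        denom = np.sqrt(ref_energy * np.dot(seg, seg))
        if denom > 0:
            scores[i] = np.dot(ref, seg) / denom

    # Take the first lag whose score is close to the global maximum, then
    # climb to the nearest local peak. Preferring the shortest such lag
    # reduces the chance of picking a multiple of the true period.
    i = int(np.argmax(scores >= rel_threshold * scores.max()))
    while i + 1 < len(scores) and scores[i + 1] > scores[i]:
        i += 1
    return fs / lags[i], scores[i]


def local_perturbation(values):
    """Mean absolute difference of consecutive values over their mean.

    Applied to cycle periods this is a local jitter ratio; applied to
    cycle peak amplitudes it is a local shimmer ratio.
    """
    v = np.asarray(values, dtype=float)
    return float(np.mean(np.abs(np.diff(v))) / np.mean(v))
```

For example, `estimate_f0(synth_vowel(125.0, 16000, 0.2), 16000)` analyses a 0.2 s frame sampled at 16 kHz. A practical system would add noise handling, sub-sample peak interpolation, and cycle-by-cycle period tracking before jitter, shimmer, or a harmonics-to-noise ratio could be derived reliably.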


Pages: 11

References: 46 in total

