Perception of synthesized voice quality in connected speech by Cantonese speakers

Cited by: 30
Authors
Yiu, EML [1]
Murdoch, B
Hird, K
Lau, P
Affiliations
[1] Univ Hong Kong, Dept Speech & Hearing Sci, Voice Res Lab, 5F Prince Philip Dent Hosp, Hong Kong, Hong Kong, Peoples R China
[2] Univ Queensland, Dept Speech Pathol & Audiol, St Lucia, Qld 4067, Australia
[3] Curtin Univ Technol, Sch Psychol, Bentley, WA 6102, Australia
DOI
10.1121/1.1500753
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
Perceptual voice analysis is a subjective process. Despite reports of varying degrees of intrajudge and interjudge reliability, it is widely used in clinical voice evaluation. One way to improve the reliability of this procedure is to provide judges with external standard signals, so that judgments can be made relative to these "anchor" signals. The present study used a Klatt speech synthesizer to create a set of speech signals, based on a Cantonese sentence, with varying degrees of three different voice qualities. The primary objective was to determine, by means of a perceptual study, whether different abnormal voice qualities could be synthesized using the synthesizer's "built-in" parameters. The second objective was to determine the relationship between the acoustic characteristics of the synthesized signals and perceptual judgment. Twenty Cantonese-speaking speech pathologists with at least three years of clinical experience in perceptual voice evaluation were asked to undertake two tasks. The first was to decide whether the voice quality of each synthesized signal was normal. The second was to decide whether the abnormal signals should be described as rough, breathy, or vocal fry. The results showed that signals generated with a small degree of aspiration noise were perceived as breathy, while signals with a small degree of flutter or double pulsing were perceived as rough. When the flutter or double pulsing increased further, tremor and vocal fry, rather than roughness, were perceived. Furthermore, the amount of aspiration noise, flutter, or double pulsing required to produce a similar level of perceived breathiness and roughness differed between male and female voice stimuli. These findings showed that changes in perceived vocal quality could be achieved by systematic modifications of synthesis parameters. This opens up the possibility of using synthesized voice signals as external standards or "anchors" to improve the reliability of clinical perceptual voice evaluation. © 2002 Acoustical Society of America.
Pages: 1091-1101
Page count: 11