Vowel formant discrimination for high-fidelity speech

Cited by: 35
Authors
Liu, C [1 ]
Kewley-Port, D [1 ]
Affiliations
[1] Indiana Univ, Dept Speech & Hearing Sci, Bloomington, IN 47405 USA
Keywords
DOI
10.1121/1.1768958
Chinese Library Classification
O42 [Acoustics];
Subject classification code
070206 ; 082403 ;
Abstract
The goal of this study was to establish the ability of normal-hearing listeners to discriminate formant frequency in vowels in everyday speech. Vowel formant discrimination in syllables, phrases, and sentences was measured for high-fidelity (nearly natural) speech synthesized by STRAIGHT [Kawahara et al., Speech Commun. 27, 187-207 (1999)]. Thresholds were measured for changes in F1 and F2 for the vowels /ɪ, ɛ, æ, ʌ/ in /bVd/ syllables. Experimental factors manipulated included phonetic context (syllables, phrases, and sentences), sentence discrimination with the addition of an identification task, and word position. Results showed that neither longer phonetic context nor the addition of the identification task significantly affected thresholds, while thresholds for word final position showed significantly better performance than for either initial or middle position in sentences. Results suggest that an average of 0.37 barks is required for normal-hearing listeners to discriminate vowel formants in modest length sentences, elevated by 84% compared to isolated vowels. Vowel formant discrimination in several phonetic contexts was slightly elevated for STRAIGHT-synthesized speech compared to formant-synthesized speech stimuli reported in the study by Kewley-Port and Zheng [J. Acoust. Soc. Am. 106, 2945-2958 (1999)]. These elevated thresholds appeared related to greater spectral-temporal variability for high-fidelity speech produced by STRAIGHT than for formant-synthesized speech. © 2004 Acoustical Society of America.
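To make the bark figures in the abstract more concrete, the following is a minimal sketch of how a threshold expressed in barks could be translated into a frequency change in Hz. It assumes the Traunmüller (1990) approximation of the bark scale; the abstract does not say which Hz-to-bark conversion the authors used, so the formula, the function names, and the example formant frequency are all assumptions made for illustration. The isolated-vowel baseline is only back-calculated from the stated 84% elevation and is not a value reported in the abstract.

```python
def hz_to_bark(f_hz: float) -> float:
    """Convert Hz to barks using Traunmüller's (1990) approximation.

    Assumption: the study may have used a different bark formula.
    """
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def bark_to_hz(z: float) -> float:
    """Algebraic inverse of hz_to_bark (same assumed formula)."""
    return 1960.0 * (z + 0.53) / (26.28 - z)


def delta_hz_for_bark_step(f_hz: float, delta_bark: float) -> float:
    """Frequency increment in Hz that corresponds to moving up by
    delta_bark barks from a formant centered at f_hz."""
    return bark_to_hz(hz_to_bark(f_hz) + delta_bark) - f_hz


# Threshold for modest-length sentences, as stated in the abstract.
SENTENCE_THRESHOLD_BARK = 0.37

# "Elevated by 84% compared to isolated vowels" implies
# sentence = 1.84 * isolated, so the implied baseline is 0.37 / 1.84.
implied_isolated_bark = SENTENCE_THRESHOLD_BARK / 1.84

if __name__ == "__main__":
    example_f2_hz = 2000.0  # hypothetical formant frequency, not from the paper
    step_hz = delta_hz_for_bark_step(example_f2_hz, SENTENCE_THRESHOLD_BARK)
    print(f"Implied isolated-vowel threshold: {implied_isolated_bark:.3f} barks")
    print(f"0.37 barks at {example_f2_hz:.0f} Hz is roughly {step_hz:.0f} Hz")
```

Because the bark scale compresses high frequencies, the same 0.37-bark step corresponds to a larger change in Hz for F2 than for F1, which is one reason thresholds are often reported in barks rather than Hz.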
Pages: 1224-1233
Number of pages: 10
Related papers
38 items in total
[1]   PSYCHOPHYSICAL THEORIES OF DURATION DISCRIMINATION [J].
ALLAN, LG ;
KRISTOFF.AB .
PERCEPTION & PSYCHOPHYSICS, 1974, 16 (01) :26-34
[2]  
ALLEN J, 1979, CONVERSION UNRESTRIC
[3]  
BISWAS A, 1997, J ACOUST SOC AM, V101, P3150
[4]  
BRADLOW AR, UNPUB THESIS CORNELL
[5]  
FLANAGAN J, 1955, J ACOUST SOC AM, V27, P288
[6]   Listeners do hear sounds, not tongues [J].
Fowler, CA .
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1996, 99 (03) :1730-1741
[7]   DERIVATION OF AUDITORY FILTER SHAPES FROM NOTCHED-NOISE DATA [J].
GLASBERG, BR ;
MOORE, BCJ .
HEARING RESEARCH, 1990, 47 (1-2) :103-138
[8]  
GREENE BG, 1986, P HUM FACT SOC, V2, P1340
[9]   DIFFERENCE LIMENS FOR FORMANT PATTERNS OF VOWEL SOUNDS [J].
HAWKS, JW .
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1994, 95 (02) :1074-1084
[10]   ACOUSTIC CHARACTERISTICS OF AMERICAN ENGLISH VOWELS [J].
HILLENBRAND, J ;
GETTY, LA ;
CLARK, MJ ;
WHEELER, K .
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1995, 97 (05) :3099-3111