The contribution of phonation type to the perception of vocal emotions in German: An articulatory synthesis study

Cited by: 20
Authors
Birkholz, Peter [1, 2]
Martin, Lucia [1, 2]
Willmes, Klaus [2, 3]
Kroeger, Bernd J. [1, 2]
Neuschaefer-Rube, Christiane [1, 2]
Affiliations
[1] Univ Hosp Aachen, Dept Phoniatr Pedaudiol & Commun Disorders, D-52074 Aachen, Germany
[2] Rhein Westfal TH Aachen, D-52074 Aachen, Germany
[3] Univ Hosp Aachen, Dept Neurol, Sect Neuropsychol, D-52074 Aachen, Germany
Keywords
VOICE QUALITY; SPEECH; EXPRESSION; MODEL; SIMULATION
DOI
10.1121/1.4906836
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Vocal emotions are signaled by specific patterns of prosodic parameters, most notably pitch, phone duration, intensity, and phonation type. Phonation type has so far been the least accessible parameter in emotion research because it has been difficult to extract from speech signals and difficult to manipulate in natural or synthetic speech. The present study built on recent advances in articulatory speech synthesis to control phonation type in isolation in re-synthesized German sentences spoken with seven different emotions. The goal was to determine to what extent a change of phonation type alone affects the perception of these emotions. To this end, portrayed emotional utterances were re-synthesized with their original phonation type as well as with purely breathy, purely modal, and purely pressed phonation, and then rated by listeners with respect to the perceived emotions. Highly significant effects of phonation type on the recognition rates of the original emotions were found, except for disgust. While fear, anger, and the neutral emotion require specific phonation types for correct perception, sadness, happiness, boredom, and disgust rely primarily on other prosodic parameters. These results can help to improve the expression of emotions in synthesized speech and facilitate the robust automatic recognition of vocal emotions. © 2015 Acoustical Society of America.
Pages: 1503-1512
Number of pages: 10