Encoding Emotions in Speech with the Size Code: A Perceptual Investigation

Cited by: 35
Authors
Chuenwattanapranithi, Suthathip [1]
Xu, Yi [2]
Thipakorn, Bundit [1]
Maneewongvatana, Songrit [1]
Affiliations
[1] King Mongkut's University of Technology Thonburi, Department of Computer Engineering, Bangkok 10140, Thailand
[2] UCL, Department of Speech, Hearing and Phonetic Sciences, London, England
DOI
10.1159/000192793
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Our current understanding of how emotions are expressed in speech is still very limited. Part of the difficulty has been the lack of understanding of the underlying mechanisms. Here we report the findings of a somewhat unconventional investigation of emotional speech. Instead of looking for direct acoustic correlates of multiple emotions, we tested a specific theory, the size code hypothesis of emotional speech, about two emotions - anger and happiness. According to the hypothesis, anger and happiness are conveyed in speech by exaggerating or understating the body size of the speaker. In two studies consisting of six experiments, we synthesized vowels with a three-dimensional articulatory synthesizer with parameter manipulations derived from the size code hypothesis, and asked Thai listeners to judge the body size and emotion of the speaker. Vowels synthesized with a longer vocal tract and lower F0 were mostly heard as from a larger person if the length and F0 differences were stationary, but from an angry person if the vocal tract was dynamically lengthened and F0 was dynamically lowered. The opposite was true for the perception of small body size and happiness. These results provide preliminary support for the size code hypothesis. They also point to potential benefits of theory-driven investigations in emotion research. Copyright (C) 2009 S. Karger AG, Basel
Pages: 210-230
Number of pages: 21