A Scoping Review of the Literature On Prosodic Elements Related to Emotional Speech in Human-Robot Interaction

Cited by: 7
Authors
Gasteiger, Norina [1 ,2 ]
Lim, JongYoon [1 ]
Hellou, Mehdi [1 ,3 ]
MacDonald, Bruce A. [1 ]
Ahn, Ho Seok [1 ]
Affiliations
[1] Univ Auckland, Dept Elect Comp & Software Engn, Auckland 1142, New Zealand
[2] Univ Manchester, Sch Hlth Sci, Manchester, Lancs, England
[3] Sorbonne Univ, Facultes Sci & Ingenieries, Dept Informat, Paris, France
Keywords
affective computing; speech; HRI; robotics; social robots; sentiment; EXPRESSION; FUTURE; WORDS; MODEL; TEXT;
DOI
10.1007/s12369-022-00913-x
Chinese Library Classification
TP24 [Robotics];
Subject Classification
080202; 1405;
Abstract
Background Sentiment expression and detection are crucial for effective and empathetic human-robot interaction. Previous work in this field often focuses on non-verbal emotion expression, such as facial expressions and gestures. Less is known about which specific prosodic speech elements are required in human-robot interaction. Our research question was: what prosodic elements are related to emotional speech in human-computer/robot interaction? Methods The scoping review was conducted in alignment with the Arksey and O'Malley methods. Literature was identified from the SCOPUS, IEEE Xplore, ACM Digital Library and PsycINFO databases in May 2021. After screening and de-duplication, data were extracted into an Excel coding sheet and summarised. Results Thirteen papers, published from 2012 to 2020, were included in the review. The most commonly used prosodic elements were tone/pitch (n = 8), loudness/volume (n = 6), speech speed (n = 4) and pauses (n = 3). Non-linguistic vocalisations (n = 1) were less frequently used. The prosodic elements were generally effective in helping to convey or detect emotion, but were less effective for negative sentiment (e.g., anger, fear, frustration, sadness and disgust). Discussion Future research should explore the effectiveness of commonly used prosodic elements (tone, loudness, speed and pauses) in emotional speech, using larger sample sizes and real-life interaction scenarios. The success of prosody in conveying negative sentiment to humans may be improved with additional non-verbal cues (e.g., coloured light or motion). More research is needed to determine how these may be combined with prosody and which combination is most effective in human-robot affective interaction.
Pages: 659-670 (12 pages)
Related Papers (50 total)
[1]   Effects of affective and emotional congruency on facial expression processing under different task demands [J].
Aguado, Luis ;
Martinez-Garcia, Natalia ;
Solis-Olce, Andrea ;
Dieguez-Risco, Teresa ;
Antonio Hinojosa, Jose .
ACTA PSYCHOLOGICA, 2018, 187 :66-76
[2]  
Aly A, 2015, IEEE INT C INT ROBOT, P2986, DOI 10.1109/IROS.2015.7353789
[3]  
Aman S., 2007, IDENTIFYING EXPRESSI, DOI 10.1007/978-3-540-74628-7_27
[4]   Asking the right questions: Scoping studies in the commissioning of research on the organisation and delivery of health services [J].
Anderson S. ;
Allen P. ;
Peckham S. ;
Goodwin N. .
Health Research Policy and Systems, 6 (1)
[5]  
[Anonymous], SOFTBANK ROBOTICS
[6]   My robot is happy today: how older people with mild cognitive impairments understand assistive robots' affective output [J].
Antona, Margherita ;
Ioannidi, Danai ;
Foukarakis, Michalis ;
Gerlowska, Justyna ;
Rejdak, Konrad ;
Abdelnour, Carla ;
Hernandez, Joan ;
Tantinya, Natalia ;
Roberto, Natalia .
12TH ACM INTERNATIONAL CONFERENCE ON PERVASIVE TECHNOLOGIES RELATED TO ASSISTIVE ENVIRONMENTS (PETRA 2019), 2019, :416-424
[7]  
Arksey H., 2005, INT J SOC RES METHOD, V8, P19, DOI DOI 10.1080/1364557032000119616
[8]   Detection of Affective States From Text and Speech for Real-Time Human-Computer Interaction [J].
Calix, Ricardo A. ;
Javadpour, Leili ;
Knapp, Gerald M. .
HUMAN FACTORS, 2012, 54 (04) :530-545
[9]  
Crumpton J, 2014, 23 IEEE INT S ROB HU
[10]  
Crystal D., 1975, The English tone of voice. Essays in intonation