Speech Emotion Recognition: Two Decades in a Nutshell, Benchmarks, and Ongoing Trends

Cited by: 283
Authors
Schuller, Bjoern W. [1]
Affiliations
[1] Univ Augsburg, Embedded Intelligence Hlth Care & Wellbeing, Augsburg, Germany
Keywords
FEATURES; AUDIO; VOICE; PITCH
DOI
10.1145/3129340
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Communication with computing machinery has become increasingly 'chatty' these days: Alexa, Cortana, Siri, and many more dialogue systems have hit the consumer market on a broader basis than ever, but do any of them truly notice our emotions and react to them like a human conversational partner would? In fact, the discipline of automatically recognizing human emotion and affective states from speech, usually referred to as Speech Emotion Recognition or SER for short, has by now surpassed the "age of majority," celebrating the 22nd anniversary after the seminal work of Dellaert et al. in 1996 [10], arguably the first research paper on the topic. However, the idea has existed even longer, as the first patent dates back to the late 1970s. © 2018 ACM.
Pages: 90-99
Number of pages: 10
Related papers
43 in total
  • [21] Kim Y, 2013, INT CONF ACOUST SPEE, P3687, DOI 10.1109/ICASSP.2013.6638346
  • [22] Emotion recognition from speech using sub-syllabic and pitch synchronous spectral features
    Koolagudi, Shashidhar
    Krothapalli, Sreenivasa
    [J]. INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2012, 15 (04) : 495 - 511
  • [23] ELIMINATION OF VERBAL CUES IN JUDGMENTS OF EMOTION FROM VOICE
    KRAMER, E
    [J]. JOURNAL OF ABNORMAL AND SOCIAL PSYCHOLOGY, 1964, 68 (04) : 390 - 396
  • [24] Voice-Only Communication Enhances Empathic Accuracy
    Kraus, Michael W.
    [J]. AMERICAN PSYCHOLOGIST, 2017, 72 (07) : 644 - 654
  • [25] Combining active learning and semi-supervised learning to construct SVM classifier
    Leng, Yan
    Xu, Xinyan
    Qi, Guanghui
    [J]. KNOWLEDGE-BASED SYSTEMS, 2013, 44 : 121 - 131
  • [26] Liu J, 2007, 2007 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOLS 1-5, P999
  • [27] Lotfian R, 2015, INT CONF ACOUST SPEE, P4759, DOI 10.1109/ICASSP.2015.7178874
  • [28] Learning Salient Features for Speech Emotion Recognition Using Convolutional Neural Networks
    Mao, Qirong
    Dong, Ming
    Huang, Zhengwei
    Zhan, Yongzhao
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2014, 16 (08) : 2203 - 2213
  • [29] Computationally Modeling Human Emotion
    Marsella, Stacy
    Gratch, Jonathan
    [J]. COMMUNICATIONS OF THE ACM, 2014, 57 (12) : 56 - 67
  • [30] Ram CS, 2016, WORLD APPL SCI J, V34, P94, DOI 10.5829/idosi.wasj.2016.34.1.15637