ATTITUDE RECOGNITION USING MULTI-RESOLUTION COCHLEAGRAM FEATURES

Authors
Haider, Fasih [1]
Luz, Saturnino [1]
Affiliations
[1] Univ Edinburgh, Edinburgh Med Sch, Usher Inst Populat Hlth Sci & Informat, Edinburgh, Midlothian, Scotland
Keywords
Feature Engineering; Attitude Recognition; Affect Recognition; Multi-Resolution Cochleagram; Video Blogs;
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Attitudes play an important role in human communication. Models and algorithms for automatic recognition of attitudes may therefore have applications in areas where successful communication and interaction are crucial, such as healthcare, education and digital entertainment. This paper focuses on the task of categorizing speaker attitudes using speech features. Data extracted from video recordings are employed in training and testing predictive models built on different sets of speech features. A novel attitude recognition approach using Multi-Resolution Cochleagram (MRCG) features is proposed. The results show that the MRCG feature set outperforms the feature sets most commonly used in computational paralinguistic tasks, including emobase, eGeMAPS and ComParE, in terms of attitude recognition accuracy for decision tree, 1-nearest neighbour and random forest classifiers. Analysis of the results suggests that MRCG features contribute information not captured by these existing feature sets. Indeed, while the ComParE feature set provides slightly better results than MRCG features for support vector machine classifiers, the fusion of the existing feature sets with the new MRCG features improves on those results. Overall, with the addition of MRCG, the attitude recognition method proposed in this study achieves accuracy scores approximately 11 points higher than reported in previous studies.
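The abstract's claim that fusing feature sets can help when each set carries information the other lacks can be illustrated with a minimal sketch. It assumes feature-level fusion by simple concatenation and a plain Euclidean 1-nearest-neighbour rule; the abstract does not say how the fusion was actually performed. The vectors, the class labels 0 to 2 and the names `set_a` and `set_b` are made-up toy values standing in for two feature sets. They are not data or results from the paper.

```python
import math


def predict_1nn(train_x, train_y, query):
    """Return the label of the training vector closest to `query` (Euclidean 1-NN)."""
    nearest = min(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return train_y[nearest]


def fuse(features_a, features_b):
    """Feature-level fusion (an assumption): concatenate the two vectors of each sample."""
    return [a + b for a, b in zip(features_a, features_b)]


def accuracy(train_x, train_y, test_x, test_y):
    """Fraction of test samples whose 1-NN prediction matches the true label."""
    hits = sum(predict_1nn(train_x, train_y, x) == y for x, y in zip(test_x, test_y))
    return hits / len(test_y)


# Toy data, one sample per hypothetical attitude class (0, 1, 2).
# Set A separates class 0 well but blurs classes 1 and 2;
# set B separates class 2 well but blurs classes 0 and 1.
labels = [0, 1, 2]
set_a_train = [[0.0], [1.0], [1.1]]
set_b_train = [[0.0], [0.1], [1.0]]
set_a_test = [[0.1], [1.06], [1.04]]
set_b_test = [[0.06], [0.04], [0.9]]

acc_a = accuracy(set_a_train, labels, set_a_test, labels)
acc_b = accuracy(set_b_train, labels, set_b_test, labels)
acc_fused = accuracy(
    fuse(set_a_train, set_b_train), labels,
    fuse(set_a_test, set_b_test), labels,
)

print(f"set A only: {acc_a:.2f}")
print(f"set B only: {acc_b:.2f}")
print(f"fused A+B:  {acc_fused:.2f}")
```

On these toy numbers each set alone classifies only one of the three test samples correctly, while the concatenated vectors classify all three. Real feature sets differ greatly in dimensionality and scale, so an actual system would likely need normalisation or feature selection before concatenation.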
Pages: 3737-3741
Number of pages: 5