Applying a Convolutional Vision Transformer for Emotion Recognition in Children with Autism: Fusion of Facial Expressions and Speech Features

Citations: 0
Authors
Wang, Yonggu [1]
Pan, Kailin [1]
Shao, Yifan [1]
Ma, Jiarong [1]
Li, Xiaojuan [2]
Affiliations
[1] Zhejiang Univ Technol, Coll Educ, Hangzhou 310023, Peoples R China
[2] Zhejiang Univ Finance & Econ, Mental Hlth Educ Ctr, Hangzhou 310018, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2025 / Vol. 15 / Issue 06
Funding
National Natural Science Foundation of China;
Keywords
emotion recognition; multimodal feature fusion; deep learning; children with autism;
DOI
10.3390/app15063083
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
With advances in digital technology, including deep learning and big data analytics, new methods have been developed for autism diagnosis and intervention. Emotion recognition and the detection of autism in children are prominent subjects in autism research. Previous research has typically relied on single-modal data to analyze the emotional states of children with autism, and the accuracy of the resulting recognition algorithms still needs improvement. Our study creates datasets of the facial and speech emotions of children with autism in their natural states. A convolutional vision transformer-based emotion recognition model is constructed for each of the two datasets. The findings indicate that the model achieves accuracies of 79.12% and 83.47% for facial expression recognition and Mel spectrogram recognition, respectively. Building on these results, we propose a multimodal data fusion strategy for emotion recognition and construct a feature fusion model based on an attention mechanism, which attains a recognition accuracy of 90.73%. Finally, gradient-weighted class activation mapping is used to produce prediction heat maps that visualize facial expressions and speech features under four emotional states. This study offers a technical direction for the use of intelligent perception technology in special education and enriches the theory of emotional intelligence perception of children with autism.
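As a rough illustration of what attention-based fusion of two modalities can look like, the sketch below weights a facial feature vector and a speech feature vector by softmax attention scores and sums them. It is not the authors' model: the function names, the scoring vector `w` (which would be learned in practice), and the toy feature values are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(face_feat, speech_feat, w):
    """Fuse two equal-length feature vectors with attention weights.

    Each modality gets a scalar score (dot product with the hypothetical
    scoring vector w); the scores are normalized with softmax and used
    as weights in a convex combination of the two feature vectors.
    """
    feats = [face_feat, speech_feat]
    scores = [sum(f_i * w_i for f_i, w_i in zip(f, w)) for f in feats]
    weights = softmax(scores)
    fused = [weights[0] * a + weights[1] * b
             for a, b in zip(face_feat, speech_feat)]
    return fused, weights


# Toy inputs (illustrative values, not taken from the paper).
face = [0.2, 0.9, -0.1]
speech = [0.7, 0.1, 0.4]
w = [1.0, 0.5, -0.5]

fused, weights = attention_fuse(face, speech, w)
print("attention weights:", weights)
print("fused features:", fused)
```

In a trained model the fused vector would typically feed a classifier over the emotion classes, and the attention weights indicate how much each modality contributes to a given prediction.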
Pages: 35
Related papers
50 in total
  • [21] Convolutional neural network-based cross-corpus speech emotion recognition with data augmentation and features fusion
    Rashid Jahangir
    Ying Wah Teh
    Ghulam Mujtaba
    Roobaea Alroobaea
    Zahid Hussain Shaikh
    Ihsan Ali
    Machine Vision and Applications, 2022, 33
  • [22] Convolutional neural network-based cross-corpus speech emotion recognition with data augmentation and features fusion
    Jahangir, Rashid
    Teh, Ying Wah
    Mujtaba, Ghulam
    Alroobaea, Roobaea
    Shaikh, Zahid Hussain
    Ali, Ihsan
    MACHINE VISION AND APPLICATIONS, 2022, 33 (03)
  • [23] Novel 1D and 2D Convolutional Neural Networks for Facial and Speech Emotion Recognition
    Bodavarapu, Pavan Nageswar Reddy
    Reddy, B. Gowtham Kumar
    Srinivas, P. V. V. S.
    THIRD INTERNATIONAL CONFERENCE ON IMAGE PROCESSING AND CAPSULE NETWORKS (ICIPCN 2022), 2022, 514 : 374 - 384
  • [24] ADIEU FEATURES? END-TO-END SPEECH EMOTION RECOGNITION USING A DEEP CONVOLUTIONAL RECURRENT NETWORK
    Trigeorgis, George
    Ringeval, Fabien
    Brueckner, Raymond
    Marchi, Erik
    Nicolaou, Mihalis A.
    Schuller, Bjoern
    Zafeiriou, Stefanos
    2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016: 5200 - 5204
  • [25] A Multimodal Facial Emotion Recognition Framework through the Fusion of Speech with Visible and Infrared Images
    Siddiqui, Mohammad Faridul Haque
    Javaid, Ahmad Y.
    MULTIMODAL TECHNOLOGIES AND INTERACTION, 2020, 4 (03) : 1 - 21
  • [26] Speech Emotion Recognition Based on Multiple Acoustic Features and Deep Convolutional Neural Network
    Bhangale, Kishor
    Kothandaraman, Mohanaprasad
    ELECTRONICS, 2023, 12 (04)
  • [27] Emotion Recognition System via Facial Expressions and Speech Using Machine Learning and Deep Learning Techniques
    Chaudhari A.
    Bhatt C.
    Nguyen T.T.
    Patel N.
    Chavda K.
    Sarda K.
    SN Computer Science, 4 (4)
  • [28] A Feature-fusion-based Convolutional Neuro-fuzzy Classifier for Facial Emotion Recognition
    Lin, Cheng-Jian
    Lin, Xue-Qian
    SENSORS AND MATERIALS, 2024, 36 (11) : 4927 - 4938
  • [29] Effects of Emotional Music on Facial Emotion Recognition in Children with Autism Spectrum Disorder (ASD)
    Gary L. Wagener
    Madeleine Berning
    Andreia P. Costa
    Georges Steffgen
    André Melzer
    Journal of Autism and Developmental Disorders, 2021, 51 : 3256 - 3265
  • [30] Effects of Emotional Music on Facial Emotion Recognition in Children with Autism Spectrum Disorder (ASD)
    Wagener, Gary L.
    Berning, Madeleine
    Costa, Andreia P.
    Steffgen, Georges
    Melzer, Andre
    JOURNAL OF AUTISM AND DEVELOPMENTAL DISORDERS, 2021, 51 (09) : 3256 - 3265