Multi-modal fusion learning through biosignal, audio, and visual content for detection of mental stress

Cited by: 6
Authors
Dogan, Gulin [1 ]
Akbulut, Fatma Patlar [2 ]
Affiliations
[1] Istanbul Kultur Univ, Dept Comp Engn, TR-34158 Istanbul, Turkiye
[2] Istanbul Kultur Univ, Dept Software Engn, TR-34158 Istanbul, Turkiye
Source
NEURAL COMPUTING & APPLICATIONS | 2023, Vol. 35, No. 34
Keywords
Stress detection; Sequential and non-sequential model; Fine-tuning; Multi-modality; MOMENTARY ASSESSMENT; RECOGNITION; VOICE; FACE;
DOI
10.1007/s00521-023-09036-4
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
Mental stress is a significant risk factor for several maladies and can negatively impact a person's quality of life, including their work and personal relationships. Traditional methods of detecting mental stress, such as interviews and questionnaires, may not capture individuals' instantaneous emotional responses. In this study, the experience-sampling method was used to analyze participants' immediate affective responses, providing a more comprehensive and dynamic understanding of their experiences. The WorkStress3D dataset was compiled from data gathered from 20 participants across three distinct modalities. Over an average of one week, 175 h of data were collected per subject, comprising physiological signals such as BVP, EDA, and body temperature, as well as facial expressions and audio. We present a novel fusion model that uses double-early fusion approaches to combine data from multiple modalities. The model's F1 score of 0.94 with a loss of 0.18 is very encouraging, showing that it can accurately identify and classify varying degrees of stress. Furthermore, we investigate the use of transfer learning techniques to improve the efficacy of our stress detection system. Despite our efforts, we were unable to attain better results than the fusion model achieved: transfer learning yielded an accuracy of 0.93 and a loss of 0.17, illustrating the difficulty of adapting pre-trained models to the task of stress analysis. These results emphasize the significance of multi-modal fusion in stress detection and the importance of selecting the most suitable model architecture for the task at hand. The proposed fusion model demonstrates its potential for accurate and robust classification of stress. This research advances the field of stress analysis and contributes to the development of effective models for stress detection.
Pages: 24435-24454
Page count: 20
Related Papers
50 records in total
  • [41] Multi-modal depression detection based on emotional audio and evaluation text
    Ye, Jiayu
    Yu, Yanhong
    Wang, Qingxiang
    Li, Wentao
    Liang, Hu
    Zheng, Yunshao
    Fu, Gang
    JOURNAL OF AFFECTIVE DISORDERS, 2021, 295 : 904 - 913
  • [42] USING COMPRESSED AUDIO-VISUAL WORDS FOR MULTI-MODAL SCENE CLASSIFICATION
    Kurcius, Jan J.
    Breckon, Toby P.
    2014 INTERNATIONAL WORKSHOP ON COMPUTATIONAL INTELLIGENCE FOR MULTIMEDIA UNDERSTANDING (IWCIM), 2014,
  • [43] Using audio, visual, and lexical features in a multi-modal virtual meeting director
    Al-Hames, Marc
    Hoernler, Benedikt
    Scheuermann, Christoph
    Rigoll, Gerhard
    MACHINE LEARNING FOR MULTIMODAL INTERACTION, 2006, 4299 : 63 - +
  • [44] Audio-visual flow - A variational approach to multi-modal flow estimation
    Hamid, R
    Bobick, A
    Yezzi, A
    ICIP: 2004 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOLS 1- 5, 2004, : 2563 - 2566
  • [45] Audio-Visual Emotion Recognition System Using Multi-Modal Features
    Handa, Anand
    Agarwal, Rashi
    Kohli, Narendra
    INTERNATIONAL JOURNAL OF COGNITIVE INFORMATICS AND NATURAL INTELLIGENCE, 2021, 15 (04)
  • [46] Multi-Modal Perception Attention Network with Self-Supervised Learning for Audio-Visual Speaker Tracking
    Li, Yidi
    Liu, Hong
    Tang, Hao
THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 1456 - 1463
  • [47] Driver drowsiness detection using multi-modal sensor fusion
    Andreeva, E
    Aarabi, P
    Philiastides, MG
    Mohajer, K
    Emami, M
MULTISENSOR, MULTISOURCE INFORMATION FUSION: ARCHITECTURES, ALGORITHMS, AND APPLICATIONS 2004, 2004, 5434 : 380 - 390
  • [48] Multi-modal Fusion Network for Rumor Detection with Texts and Images
    Li, Boqun
    Qian, Zhong
    Li, Peifeng
    Zhu, Qiaoming
    MULTIMEDIA MODELING (MMM 2022), PT I, 2022, 13141 : 15 - 27
  • [49] Attention-based multi-modal fusion sarcasm detection
    Liu, Jing
    Tian, Shengwei
    Yu, Long
    Long, Jun
    Zhou, Tiejun
    Wang, Bo
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2023, 44 (02) : 2097 - 2108
  • [50] Multi-Modal Sarcasm Detection in Twitter with Hierarchical Fusion Model
    Cai, Yitao
    Cai, Huiyu
    Wan, Xiaojun
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 2506 - 2515