Speech Intention Classification with Multimodal Deep Learning

Cited by: 28
Authors
Gu, Yue [1]
Li, Xinyu [1]
Chen, Shuhong [1]
Zhang, Jianyu [1]
Marsic, Ivan [1]
Affiliations
[1] Rutgers State Univ, Dept Elect & Comp Engn, New Brunswick, NJ 08901 USA
Source
ADVANCES IN ARTIFICIAL INTELLIGENCE, CANADIAN AI 2017 | 2017 / Vol. 10233
Keywords
Multimodal intention classification; Textual-acoustic feature representation; Convolutional neural network; Trauma resuscitation; Emotion recognition
DOI
10.1007/978-3-319-57351-9_30
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We present a novel multimodal deep learning structure that automatically extracts features from textual-acoustic data for sentence-level speech classification. Textual and acoustic features are first extracted using two independent convolutional neural network structures, then combined into a joint representation, and finally fed into a decision softmax layer. We tested the proposed model in an actual medical setting, using speech recordings and their transcribed logs. Our model achieved 83.10% average accuracy in detecting 6 different intentions. We also found that our model, which uses automatically extracted features for intention classification, outperformed existing models that use hand-crafted features.
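The pipeline the abstract describes (two independent convolutional branches, a joint representation, and a softmax decision layer) can be illustrated with a minimal sketch. This is not the authors' implementation: the weights are random and untrained, the feature dimensions, filter counts, and filter widths are arbitrary, max-over-time pooling is an assumed choice, and fusing the branches by simple concatenation is an assumption, since the abstract only says the features are "combined". All function and variable names are hypothetical.

```python
import math
import random


def conv1d_max_pool(seq, filters):
    """Slide each filter over the sequence, apply ReLU, keep the max over time.

    seq:     list of T frames, each a list of D floats
    filters: list of filters, each a list of `width` rows of D weights
    returns: one pooled activation per filter
    """
    pooled = []
    for f in filters:
        width = len(f)
        best = 0.0  # starting at 0 acts as the ReLU floor
        for t in range(len(seq) - width + 1):
            window = seq[t:t + width]
            act = sum(w * x
                      for row, frame in zip(f, window)
                      for w, x in zip(row, frame))
            best = max(best, act)
        pooled.append(best)
    return pooled


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def random_filters(rng, count, width, dim):
    """Untrained placeholder filters (assumption: real filters are learned)."""
    return [[[rng.uniform(-0.5, 0.5) for _ in range(dim)]
             for _ in range(width)]
            for _ in range(count)]


def classify(text_seq, audio_seq, text_filters, audio_filters, out_weights):
    """Two independent branches -> concatenated joint vector -> softmax."""
    text_feat = conv1d_max_pool(text_seq, text_filters)
    audio_feat = conv1d_max_pool(audio_seq, audio_filters)
    joint = text_feat + audio_feat  # list concatenation (assumed fusion)
    logits = [sum(w * x for w, x in zip(row, joint)) for row in out_weights]
    return softmax(logits)


rng = random.Random(0)
NUM_CLASSES = 6  # the abstract reports 6 intention classes

# Hypothetical inputs: 10 word embeddings (8-dim) and 40 acoustic frames (5-dim).
text_seq = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(10)]
audio_seq = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(40)]

text_filters = random_filters(rng, count=4, width=3, dim=8)
audio_filters = random_filters(rng, count=4, width=5, dim=5)
out_weights = [[rng.uniform(-0.5, 0.5) for _ in range(8)]
               for _ in range(NUM_CLASSES)]

probs = classify(text_seq, audio_seq, text_filters, audio_filters, out_weights)
predicted = max(range(NUM_CLASSES), key=probs.__getitem__)
print(predicted, [round(p, 3) for p in probs])
```

Because the weights are random, the printed class is meaningless; the sketch only shows how the two modality-specific feature vectors meet in a single vector before the decision layer.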
Pages: 260-271
Page count: 12