Speech Intention Classification with Multimodal Deep Learning

Cited by: 28

Authors:

Gu, Yue [1]

Li, Xinyu [1]

Chen, Shuhong [1]

Zhang, Jianyu [1]

Marsic, Ivan [1]

Affiliations:

[1] Rutgers State Univ, Dept Elect & Comp Engn, New Brunswick, NJ 08901 USA

Source:

ADVANCES IN ARTIFICIAL INTELLIGENCE, CANADIAN AI 2017 | 2017 / Vol. 10233

Keywords:

Multimodal intention classification; Textual-acoustic feature representation; Convolutional neural network; Trauma resuscitation; Emotion recognition

DOI:

10.1007/978-3-319-57351-9_30

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

We present a novel multimodal deep learning structure that automatically extracts features from textual-acoustic data for sentence-level speech classification. Textual and acoustic features were first extracted using two independent convolutional neural network structures, then combined into a joint representation, and finally fed into a decision softmax layer. We tested the proposed model in an actual medical setting, using speech recording and its transcribed log. Our model achieved 83.10% average accuracy in detecting 6 different intentions. We also found that our model using automatically extracted features for intention classification outperformed existing models that use manufactured features.
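The pipeline described in the abstract (two independent convolutional feature extractors, a joint representation, and a softmax decision layer) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, filter counts, embedding sizes, and random untrained weights below are all assumptions chosen for illustration, and the record does not state the actual layer configuration.

```python
import math
import random

def conv1d_maxpool(seq, filters):
    """Valid 1-D convolution over time, ReLU, then max-over-time pooling.

    seq: list of T feature vectors; each filter: list of `width` vectors.
    Returns one pooled feature per filter.
    """
    dim = len(seq[0])
    feats = []
    for f in filters:
        width = len(f)
        responses = [
            max(0.0, sum(f[k][d] * seq[t + k][d]
                         for k in range(width) for d in range(dim)))
            for t in range(len(seq) - width + 1)
        ]
        feats.append(max(responses))
    return feats

def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(text_seq, audio_seq, text_filters, audio_filters, W, b):
    """Extract features per modality, concatenate them, apply a softmax layer."""
    joint = (conv1d_maxpool(text_seq, text_filters)
             + conv1d_maxpool(audio_seq, audio_filters))
    logits = [sum(w * x for w, x in zip(row, joint)) + bias
              for row, bias in zip(W, b)]
    return softmax(logits)

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
NUM_CLASSES = 6                       # six intentions, as stated in the abstract
text_seq = rand_matrix(10, 8, rng)    # hypothetical: 10 tokens, 8-dim embeddings
audio_seq = rand_matrix(50, 4, rng)   # hypothetical: 50 frames, 4-dim acoustic features
text_filters = [rand_matrix(3, 8, rng) for _ in range(5)]   # 5 filters of width 3
audio_filters = [rand_matrix(5, 4, rng) for _ in range(5)]  # 5 filters of width 5
W = rand_matrix(NUM_CLASSES, 10, rng)  # 10 = 5 text + 5 audio pooled features
b = [0.0] * NUM_CLASSES

probs = classify(text_seq, audio_seq, text_filters, audio_filters, W, b)
predicted = max(range(NUM_CLASSES), key=probs.__getitem__)
```

Because the weights are random, `predicted` carries no meaning here; the sketch only shows how the two modality-specific feature vectors are concatenated before the shared decision layer.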

Pages: 260-271

Page count: 12
