Multi-Modal Emotion Aware System Based on Fusion of Speech and Brain Information

Cited: 4
Authors
Ghoniem, Rania M. [1 ,2 ]
Algarni, Abeer D. [2 ]
Shaalan, Khaled [3 ,4 ]
Affiliations
[1] Mansoura Univ, Dept Comp, Mansoura 35516, Egypt
[2] Princess Nourah Bint Abdulrahman Univ, Coll Comp & Informat Sci, Dept Informat Technol, Riyadh 84428, Saudi Arabia
[3] British Univ Dubai, Fac Engn, Dubai 345015, U Arab Emirates
[4] British Univ Dubai, IT, Dubai 345015, U Arab Emirates
Keywords
multi-modal emotion aware systems; speech processing; EEG signal processing; hybrid classification models; GENETIC ALGORITHM; NEURAL-NETWORKS; SENTIMENT ANALYSIS; FEATURE-SELECTION; RECOGNITION; EEG; CLASSIFICATION; FEATURES; MACHINE; MODEL
DOI
10.3390/info10070239
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
In multi-modal emotion-aware frameworks, emotional features must be estimated from each modality and then fused, following either a feature-level or a decision-level strategy. Although features drawn from several modalities can improve classification performance, they tend to be high-dimensional and therefore complicate learning for the most widely used machine learning algorithms. To address these feature extraction and multi-modal fusion issues, hybrid fuzzy-evolutionary computation methodologies are employed for their strong feature learning and dimensionality reduction capabilities. This paper proposes a novel multi-modal emotion-aware system that fuses speech with EEG. First, a mixed feature set of speaker-dependent and speaker-independent characteristics is estimated from the speech signal. EEG is then used as an inner channel complementing speech for more reliable recognition, with features extracted from the time, frequency, and time-frequency domains. For classifying the unimodal data of either speech or EEG, a hybrid fuzzy c-means-genetic algorithm-neural network (FCM-GA-NN) model is proposed, whose fitness function searches for the fuzzy cluster number that minimizes the classification error. To fuse speech with EEG information, a separate classifier is trained for each modality and the final decision is computed by integrating their posterior probabilities. Results show the superiority of the proposed model: the average accuracy is 98.06% for EEG, 97.28% for speech, and 98.53% for multi-modal recognition. Applied to two public databases, SAVEE (speech) and MAHNOB (EEG), the proposed model achieves accuracies of 98.21% and 98.26%, respectively.
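The abstract describes the unimodal FCM-GA-NN classifier only at a high level. The following Python sketch illustrates the general idea: a genetic search over the fuzzy c-means cluster number, with fitness taken as the cross-validated accuracy of a small neural network trained on the membership features. Every concrete choice here (the plain FCM implementation, scikit-learn's MLPClassifier as the neural stage, the toy GA with ±1 mutation) is an illustrative assumption, not the paper's implementation.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

def fcm_memberships(X, n_clusters, m=2.0, n_iter=100, seed=0):
    """Plain fuzzy c-means; returns the membership matrix U (n_samples x n_clusters)."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        U = 1.0 / (d ** (2.0 / (m - 1)))   # normalized below: u_ij = 1 / sum_k (d_ij/d_ik)^(2/(m-1))
        U /= U.sum(axis=1, keepdims=True)
    return U

def fitness(n_clusters, X, y):
    """Fitness = cross-validated accuracy of a small neural network trained on
    the fuzzy membership features (minimizing error = maximizing this score)."""
    U = fcm_memberships(X, n_clusters)
    clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0)
    return cross_val_score(clf, U, y, cv=3).mean()

def ga_search(X, y, c_min=2, c_max=12, pop_size=8, generations=5, seed=0):
    """Toy GA over the cluster count: keep the fitter half, refill by +/-1 mutation.
    NOTE: an illustrative stand-in, not the paper's FCM-GA-NN pipeline."""
    rng = np.random.default_rng(seed)
    pop = rng.integers(c_min, c_max + 1, size=pop_size)
    for _ in range(generations):
        ranked = sorted(pop, key=lambda c: -fitness(int(c), X, y))
        elite = ranked[: pop_size // 2]
        children = [np.clip(c + rng.choice([-1, 1]), c_min, c_max) for c in elite]
        pop = np.array(elite + children)
    return int(max(pop, key=lambda c: fitness(int(c), X, y)))
```

In practice one would memoize the fitness of repeated cluster counts, since each evaluation reruns FCM and the cross-validation.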
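Likewise, the decision-level fusion step only states that the per-modality posterior probabilities are "integrated". A minimal sketch, assuming a weighted sum rule (both the weights and the sum rule itself are assumptions):

```python
import numpy as np

def fuse_posteriors(p_speech, p_eeg, w_speech=0.5, w_eeg=0.5):
    """Decision-level fusion of two unimodal classifiers.

    p_speech, p_eeg: arrays of shape (n_samples, n_classes) holding each
    classifier's posterior probabilities P(class | x). Returns the fused
    class label per sample.
    """
    fused = w_speech * np.asarray(p_speech) + w_eeg * np.asarray(p_eeg)
    fused /= fused.sum(axis=1, keepdims=True)  # renormalize to a distribution
    return fused.argmax(axis=1)                # predicted emotion label per sample

# Example: three emotion classes, one sample; the fused decision is class 0.
p_speech = np.array([[0.7, 0.2, 0.1]])
p_eeg    = np.array([[0.4, 0.5, 0.1]])
print(fuse_posteriors(p_speech, p_eeg))  # -> [0]
```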
Pages: 39