Pattern recognition and features selection for speech emotion recognition model using deep learning

Cited by: 1

Authors:

Kittisak Jermsittiparsert

Abdurrahman Abdurrahman

Parinya Siriattakul

Ludmila A. Sundeeva

Wahidah Hashim

Robbi Rahim

Andino Maseleno

Affiliations:

[1] Ton Duc Thang University, Physics Education Department

[2] Lampung University, School of Psychology

[3] University of Queensland, Institute of Informatics and Computing Energy

[4] Togliatti State University, Department of Information Systems

[5] Universiti Tenaga Nasional

[6] Sekolah Tinggi Ilmu Manajemen Sukma

[7] STMIK Pringsewu

Source:

International Journal of Speech Technology | 2020 / Vol. 23

Keywords:

Deep learning; Speech; Emotion recognition; Feature extraction

DOI:

Not available

Chinese Library Classification number:

Subject classification number:

Abstract:

Automatic speaker recognition models are founded on building various models for speaker characterization, pattern analysis, and engineering. This work focuses on the effect of classification and feature selection methods on speech emotion recognition. Selecting the right parameters in combination with the classifier is an important part of reducing the computational complexity of the system, and it becomes essential for models deployed in real-time scenarios. In this paper, a new deep learning based speech recognition model is presented for automatically recognizing spoken words. The quality of the input source, i.e., the speech sound, has a direct impact on the accuracy the classifier attains. The Berlin database consists of around 500 recorded utterances from both male and female speakers. On this dataset, the presented model achieves maximum accuracies of 94.21%, 83.54%, 83.65%, and 78.13% with MFCC, prosodic, LSP, and LPC features, respectively. The presented model offers better recognition performance than the other methods compared.
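The abstract lists LPC among the compared feature sets but does not say how the coefficients are computed. As a rough illustration only, the sketch below uses the standard autocorrelation method with the Levinson-Durbin recursion. It is not the authors' pipeline, and the function names `autocorrelation` and `lpc`, the frame length, and the model order are all assumptions chosen for the example.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one speech frame (a list of floats)."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def lpc(frame, order):
    """Estimate LPC coefficients with the Levinson-Durbin recursion.

    Returns (a, err) where a[0] == 1.0 and the predictor is
    s[n] ~= -(a[1]*s[n-1] + ... + a[order]*s[n-order]);
    err is the final prediction error energy.
    """
    r = autocorrelation(frame, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # silent or perfectly predictable frame: stop early
        # Reflection coefficient for this model order.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        # Update coefficients from the order i-1 solution.
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err


# Toy frame (an assumption): a decaying exponential, s[n] = 0.9 * s[n-1].
frame = [0.9 ** n for n in range(200)]
coeffs, residual_energy = lpc(frame, order=2)
```

For this toy frame the first coefficient comes out close to -0.9 and the second close to 0, matching the one-tap recurrence that generated the signal. A real system would typically window each frame and use a higher model order before passing the coefficients to a classifier.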

Citation

Pages: 799-806

Page count: 7
