Speech Emotion Recognition Using Deep Learning LSTM for Tamil Language

Cited by: 4

Authors:

Fernandes, Bennilo [1]

Mannepalli, Kasiprasad [1]

Affiliation:

[1] Koneru Lakshmaiah Educ Fdn, Dept Elect & Commun Engn, Guntur, Andhra Pradesh, India

Source:

PERTANIKA JOURNAL OF SCIENCE AND TECHNOLOGY | 2021 / Vol. 29 / No. 03

Keywords:

BiLSTM; DNN; Emotional Recognition; LSTM; RNN; CONVOLUTIONAL NEURAL-NETWORK;

DOI:

10.47836/pjst.29.3.33

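Chinese Library Classification (CLC) number:

O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];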

Discipline classification code:

07 ; 0710 ; 09 ;

Abstract:

Deep Neural Networks (DNNs) are more than just neural networks with several hidden units: they give better classification results in automated voice recognition tasks. Traditional feedforward neural networks consider only spatial correlation and do not handle speech signals adequately, so recurrent neural networks (RNNs) were introduced. Long Short-Term Memory (LSTM) networks are a special case of RNNs used in speech processing because they capture long-term dependencies. Deep Hierarchical LSTM and BiLSTM models are therefore designed with dropout layers to reduce the gradient and long-term learning error in emotional speech analysis. Four combinations of deep hierarchical learning architecture are designed with dropout layers to improve the networks: Deep Hierarchical LSTM and LSTM (DHLL), Deep Hierarchical LSTM and BiLSTM (DHLB), Deep Hierarchical BiLSTM and LSTM (DHBL), and Deep Hierarchical dual BiLSTM (DHBB). This paper compares the performance of all four models and attains better classification efficiency with a minimal Tamil-language dataset. The experimental results show that DHLB reaches the best precision, about 84%, in recognizing emotions in the Tamil database, while DHBL gives 83% efficiency. The remaining designs, DHLL and DHBB, perform comparably to each other but below these two models, at 81% efficiency, with the smaller dataset and minimal execution and training time.
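The naming of the four variants can be made concrete with a small sketch. It is only an illustration of how the abbreviations map to layer orderings, under the assumption that each variant stacks two recurrent layers in the named order and follows each with a dropout layer. The `LayerSpec` type, the `build_stack` helper, the hidden size, the dropout rate, and the number of emotion classes are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class LayerSpec:
    """Framework-independent description of one layer (hypothetical type)."""
    kind: str          # "LSTM", "BiLSTM", "Dropout", or "Dense"
    units: int = 0     # hidden size for recurrent/dense layers
    rate: float = 0.0  # drop probability for dropout layers


# Order of the two recurrent layers in each variant, as named in the abstract.
VARIANTS: Dict[str, Tuple[str, str]] = {
    "DHLL": ("LSTM", "LSTM"),
    "DHLB": ("LSTM", "BiLSTM"),
    "DHBL": ("BiLSTM", "LSTM"),
    "DHBB": ("BiLSTM", "BiLSTM"),
}


def build_stack(
    name: str,
    units: int = 128,       # assumed hidden size, not from the paper
    dropout: float = 0.5,   # assumed dropout rate, not from the paper
    num_classes: int = 7,   # assumed number of emotion classes
) -> List[LayerSpec]:
    """Return the layer sequence for one variant: each recurrent layer
    is followed by dropout, and a dense classifier closes the stack."""
    layers: List[LayerSpec] = []
    for kind in VARIANTS[name]:
        layers.append(LayerSpec(kind, units=units))
        layers.append(LayerSpec("Dropout", rate=dropout))
    layers.append(LayerSpec("Dense", units=num_classes))
    return layers


def describe(name: str) -> str:
    """Render a variant as a readable chain of layer kinds."""
    return " -> ".join(layer.kind for layer in build_stack(name))


if __name__ == "__main__":
    for variant in VARIANTS:
        print(variant, ":", describe(variant))
```

Each `LayerSpec` list could be translated into layers of a deep learning framework of choice; the sketch deliberately stays framework-free so that it makes no claim about the authors' actual implementation.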


Pages: 1915-1936

Page count: 22

References: 34 in total

