Speech Emotion Recognition Using Unsupervised Feature Selection Algorithms

Cited by: 9
Authors
Bandela, Surekha Reddy [1]
Kumar, T. Kishore [1]
Affiliations
[1] Natl Inst Technol Warangal, Dept ECE, Warangal, Andhra Pradesh, India
Keywords
Speech Emotion Recognition (SER); INTERSPEECH Paralinguistic Feature Set; GTCC; feature selection; feature optimization; FSASL; UFSOL; SuFS;
DOI
10.13164/re.2020.0353
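Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract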
Combining different speech features is a common practice for improving the accuracy of Speech Emotion Recognition (SER). However, this can sharply increase processing time, and some of the features contribute little to emotion recognition; they often cause incorrect predictions, which substantially decreases the accuracy of the SER system. Hence, there is a need to select a feature set that contributes significantly to emotion recognition. This paper presents the use of the Feature Selection with Adaptive Structure Learning (FSASL) and Unsupervised Feature Selection with Ordinal Locality (UFSOL) algorithms for feature dimension reduction, to improve SER performance with a reduced feature dimension. A novel Subset Feature Selection (SuFS) algorithm is proposed to further reduce the feature dimension and achieve comparable or better accuracy when used along with the FSASL and UFSOL algorithms. The work uses 1582 INTERSPEECH 2010 Paralinguistic features, 20 Gammatone Cepstral Coefficients (GTCC), and a Support Vector Machine classifier with 10-fold cross-validation and hold-out validation. The EMO-DB and IEMOCAP databases are used to evaluate the proposed SER system in terms of classification accuracy and computational time. The result analysis shows that the proposed SER system outperforms the existing ones.
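The evaluation protocol named in the abstract (unsupervised feature selection, then a classifier scored with 10-fold cross-validation and hold-out validation) can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration and not the authors' method: the data are synthetic, a simple variance ranking stands in for FSASL/UFSOL/SuFS, a nearest-centroid classifier stands in for the Support Vector Machine, and all function names are hypothetical.

```python
import random
import statistics


def make_toy_data(n_per_class=30, n_features=12, n_informative=3, seed=0):
    """Synthetic stand-in for utterance-level feature vectors (not speech data)."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = []
            for j in range(n_features):
                if j < n_informative:
                    # Informative feature: class means are 4 standard deviations apart.
                    row.append(rng.gauss(4.0 * label, 1.0))
                else:
                    # Uninformative feature: low-variance noise, same for both classes.
                    row.append(rng.gauss(0.0, 0.3))
            X.append(row)
            y.append(label)
    return X, y


def select_by_variance(X, k):
    """Unsupervised selection: keep the k highest-variance features (labels unused).
    This is a simple stand-in, not FSASL, UFSOL, or SuFS."""
    n_features = len(X[0])
    variances = [statistics.pvariance([row[j] for row in X]) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: variances[j], reverse=True)
    return sorted(ranked[:k])


def project(X, columns):
    """Keep only the selected feature columns."""
    return [[row[j] for j in columns] for row in X]


def fit_centroids(X, y):
    """Nearest-centroid model (stand-in for the SVM): one mean vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    return min(centroids, key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))


def accuracy(X, y, train_idx, test_idx):
    """Train on train_idx, return the fraction of test_idx classified correctly."""
    model = fit_centroids([X[i] for i in train_idx], [y[i] for i in train_idx])
    return sum(predict(model, X[i]) == y[i] for i in test_idx) / len(test_idx)


def cross_val_accuracy(X, y, k=10, seed=0):
    """k-fold cross-validation: each sample is used for testing exactly once."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        scores.append(accuracy(X, y, train_idx, test_idx))
    return sum(scores) / k


def hold_out_accuracy(X, y, test_fraction=0.2, seed=0):
    """Hold-out validation: a single random train/test split."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    n_test = int(len(X) * test_fraction)
    return accuracy(X, y, idx[n_test:], idx[:n_test])


if __name__ == "__main__":
    X, y = make_toy_data()
    kept = select_by_variance(X, k=3)  # selection never looks at the labels
    X_small = project(X, kept)
    print("kept feature indices:", kept)
    print("10-fold accuracy:", cross_val_accuracy(X_small, y, k=10))
    print("hold-out accuracy:", hold_out_accuracy(X_small, y))
```

In this toy setting the selection step runs once on all samples before the splits; because it ignores the labels, no label information leaks into the test folds, although a stricter protocol would repeat the selection inside each training fold.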
Pages: 353-364
Number of pages: 12
Related papers
25 in total
[1]   Supervised, Unsupervised, and Semi-Supervised Feature Selection: A Review on Gene Selection [J].
Ang, Jun Chin ;
Mirzal, Andri ;
Haron, Habibollah ;
Hamed, Haza Nuzly Abdull .
IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2016, 13 (05) :971-989
[2]   Feature Selection for Speech Emotion Recognition in Spanish and Basque: On the Use of Machine Learning to Improve Human-Computer Interaction [J].
Arruti, Andoni ;
Cearreta, Idoia ;
Alvarez, Aitor ;
Lazkano, Elena ;
Sierra, Basilio .
PLOS ONE, 2014, 9 (10)
[3]  
Burkhardt F., 2005, 9 EUROPEAN C SPEECH
[4]   IEMOCAP: interactive emotional dyadic motion capture database [J].
Busso, Carlos ;
Bulut, Murtaza ;
Lee, Chi-Chun ;
Kazemzadeh, Abe ;
Mower, Emily ;
Kim, Samuel ;
Chang, Jeannette N. ;
Lee, Sungbok ;
Narayanan, Shrikanth S. .
LANGUAGE RESOURCES AND EVALUATION, 2008, 42 (04) :335-359
[5]  
Cheng SS, 2016, 2016 2ND INTERNATIONAL CONFERENCE ON MECHANICAL, ELECTRONIC AND INFORMATION TECHNOLOGY ENGINEERING (ICMITE 2016), P1
[6]   Unsupervised Feature Selection with Adaptive Structure Learning [J].
Du, Liang ;
Shen, Yi-Dong .
KDD'15: PROCEEDINGS OF THE 21ST ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2015, :209-218
[7]   Survey on speech emotion recognition: Features, classification schemes, and databases [J].
El Ayadi, Moataz ;
Kamel, Mohamed S. ;
Karray, Fakhri .
PATTERN RECOGNITION, 2011, 44 (03) :572-587
[8]  
Eyben F., 2010, P ACM INT C MULT, P1459
[9]   Improving the performance of the speaker emotion recognition based on low dimension prosody features vector [J].
Gudmalwar, Ashishkumar Prabhakar ;
Rao, Ch V. Rama ;
Dutta, Anirban .
INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2019, 22 (03) :521-531
[10]  
Guo J, 2017, IEEE INT CON MULTI, P1213, DOI 10.1109/ICME.2017.8019357