An effective gender recognition approach using voice data via deeper LSTM networks

Cited by: 51
Authors
Ertam, Fatih [1]
Affiliations
[1] Firat Univ, Technol Fac, Dept Digital Forens Engn, Elazig, Turkey
Keywords
Gender recognition; Gender classification; Deep learning; Deeper LSTM; Machine learning; SPEAKERS AGE; CLASSIFICATION; FRAMEWORK;
DOI
10.1016/j.apacoust.2019.07.033
Chinese Library Classification
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
Humans can usually tell a speaker's gender from an audio recording without difficulty, drawing on the experience they have acquired. For computer systems, however, predicting whether a speaker is a man or a woman is not easy, and many papers and proposals have addressed this problem. In this study, a Deeper Long Short-Term Memory (LSTM) network structure was used to predict gender from an audio data set, achieving an accuracy of 98.4%. The proposed approach consists of three main steps: (i) the 10 most effective data attributes were selected; (ii) a deep learning-based network was created with a double-layer LSTM structure; (iii) in addition to classification accuracy, the sensitivity and specificity performance metrics were calculated. The accuracy of the proposed method was also compared with the accuracy values obtained from classifiers built with conventional machine learning approaches. It is thought that the study will be a pioneer in this field as an effective and fast approach for gender recognition. © 2019 Elsevier Ltd. All rights reserved.
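Step (iii) of the abstract names accuracy, sensitivity, and specificity. As a minimal sketch of how these three metrics are conventionally computed for a two-class problem, the Python below derives them from confusion-matrix counts. The function name `binary_metrics`, the choice of label 1 as the positive class, and the toy labels are assumptions made for illustration; they are not taken from the paper, and the record does not say which gender the study treated as the positive class.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return accuracy, sensitivity and specificity for labels in {0, 1}.

    Label 1 is treated as the positive class (an assumption for this sketch).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    total = tp + tn + fp + fn
    nan = float("nan")
    return {
        # Share of all samples classified correctly.
        "accuracy": (tp + tn) / total if total else nan,
        # True positive rate: positives that were recognised as positive.
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # True negative rate: negatives that were recognised as negative.
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


# Toy labels for illustration only; these are not data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting sensitivity and specificity alongside accuracy shows whether errors are concentrated in one class, which a single accuracy figure can hide.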
Pages: 351-358
Number of pages: 8