Stochastic gradient descent analysis for the evaluation of a speaker recognition

Cited by: 0
Authors
Ashrf Nasef
Marina Marjanović-Jakovljević
Angelina Njeguš
Affiliations
[1] Singidunum University, Faculty of Electrical Engineering and Computing
[2] Singidunum University, Faculty of Informatics and Computing
Source
Analog Integrated Circuits and Signal Processing | 2017 / Vol. 90
Keywords
Pattern recognition; Speech analysis; Deep learning Neural Network; Stochastic gradient descent; Learning rate; Dropout rate
DOI
Not available
Abstract
Performance optimization in speaker recognition is a challenging task in the field of voice-based human-computer interaction. Many studies have shown that deep learning neural network methods perform better than other classifiers. However, because these methods have many parameters, they require considerable tuning to perform well across different supervised learning tasks. In this paper, we show that choosing a good combination of parameters can significantly improve the performance of a deep learning neural network trained with stochastic gradient descent in automatic speaker recognition, even in a noisy environment. The parameters analyzed are the learning rate, the hidden layer dropout rate, and the input layer dropout rate.
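The kind of parameter study the abstract describes can be illustrated with a minimal sketch. It is not the authors' implementation: the network size, the synthetic data standing in for speaker features, the grid values, and all function names (make_data, train_sgd, accuracy, grid_search) are assumptions made for illustration. It trains a small one-hidden-layer network with minibatch stochastic gradient descent, applies dropout to the input and hidden layers, and compares validation accuracy across combinations of learning rate and the two dropout rates.

```python
import itertools
import numpy as np

def make_data(n=300, d=8, classes=3, seed=0):
    """Synthetic stand-in for speaker feature vectors (NOT real speech data)."""
    rng = np.random.default_rng(seed)
    centers = rng.normal(0.0, 2.0, size=(classes, d))
    y = rng.integers(0, classes, size=n)
    X = centers[y] + rng.normal(0.0, 1.0, size=(n, d))
    return X, y

def train_sgd(X, y, classes, lr, p_in, p_hid, hidden=16, epochs=20, batch=32, seed=0):
    """Train a one-hidden-layer ReLU network with minibatch SGD and inverted dropout."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, size=(d, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, size=(hidden, classes)); b2 = np.zeros(classes)
    for _ in range(epochs):
        order = rng.permutation(n)
        for start in range(0, n, batch):
            idx = order[start:start + batch]
            xb, yb = X[idx], y[idx]
            # Dropout masks: drop each unit with probability p, rescale the survivors.
            m_in = (rng.random(xb.shape) >= p_in) / (1.0 - p_in)
            x_d = xb * m_in
            z1 = x_d @ W1 + b1
            h = np.maximum(z1, 0.0)
            m_h = (rng.random(h.shape) >= p_hid) / (1.0 - p_hid)
            h_d = h * m_h
            logits = h_d @ W2 + b2
            logits -= logits.max(axis=1, keepdims=True)  # numerical stability
            probs = np.exp(logits)
            probs /= probs.sum(axis=1, keepdims=True)
            # Gradient of the mean cross-entropy loss with respect to the logits.
            g = probs.copy()
            g[np.arange(len(idx)), yb] -= 1.0
            g /= len(idx)
            gW2 = h_d.T @ g
            gb2 = g.sum(axis=0)
            gz1 = (g @ W2.T) * m_h * (z1 > 0)  # back through dropout and ReLU
            gW1 = x_d.T @ gz1
            gb1 = gz1.sum(axis=0)
            # Plain SGD update scaled by the learning rate.
            W1 -= lr * gW1; b1 -= lr * gb1
            W2 -= lr * gW2; b2 -= lr * gb2
    return W1, b1, W2, b2

def accuracy(params, X, y):
    """Classification accuracy with dropout disabled (evaluation mode)."""
    W1, b1, W2, b2 = params
    h = np.maximum(X @ W1 + b1, 0.0)
    return float(((h @ W2 + b2).argmax(axis=1) == y).mean())

def grid_search(lrs, p_ins, p_hids, classes=3, seed=0):
    """Validation accuracy for every (learning rate, input dropout, hidden dropout)."""
    X, y = make_data(classes=classes, seed=seed)
    split = int(0.7 * len(X))
    X_tr, y_tr, X_va, y_va = X[:split], y[:split], X[split:], y[split:]
    results = {}
    for lr, p_in, p_hid in itertools.product(lrs, p_ins, p_hids):
        params = train_sgd(X_tr, y_tr, classes, lr, p_in, p_hid, seed=seed)
        results[(lr, p_in, p_hid)] = accuracy(params, X_va, y_va)
    return results

if __name__ == "__main__":
    results = grid_search([0.01, 0.1], [0.0, 0.2], [0.0, 0.5])
    best = max(results, key=results.get)
    print("best (lr, p_in, p_hid):", best, "validation accuracy:", results[best])
```

Selecting on a held-out validation split rather than on the training data keeps the comparison between parameter combinations honest. A noisy-environment study like the one in the abstract would additionally perturb the input features, which this sketch does not attempt.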
Pages: 389-397
Number of pages: 8