Optimization of stammering in speech recognition applications

Cited by: 4
Authors
Mishra, Nishant [1 ]
Gupta, Akash [1 ]
Vathana, D. [1 ]
Affiliations
[1] SRM Inst Sci & Technol, Dept Comp Sci & Engn, Kattankulathur, India
Keywords
Speech recognition; Speech classifier; Feature extraction; Stammer detection; Stammer removal
DOI
10.1007/s10772-021-09828-w
CLC classification codes
TM [Electrical Engineering]; TN [Electronic and Communication Technology]
Subject classification codes
0808; 0809
Abstract
In the field of speech recognition, existing work focuses only on classifying speech as either stammered or normal; further processing is carried out only on the normal, usable speech. The goal of this research is to correct stammered speech and make it generally usable. It is observed that when a person stammers, the amplitude of their voice decreases, and this drop can be used to eliminate repetitions, elongations, and silent intervals, which allows a better speech recognition system to be built. The main improvement of this research over previous implementations is a new deep-learning algorithm that enhances speech recognition for people who stammer, interfaced with a real-time web application. The procedure is to first record the audio, remove the stammers, and convert the processed data into recognized speech. Stammering is removed using an amplitude threshold obtained from a neural network model. The recognized speech is then ready for further real-world use.
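As a rough illustration of the amplitude-threshold step described in the abstract, the sketch below drops low-amplitude frames from a recording. It is not the paper's implementation: the function name remove_low_amplitude_frames, the frame and hop sizes, and the fixed threshold value are all illustrative assumptions; in the paper the threshold comes from a trained neural network model, which is not reproduced here.

# Minimal sketch of amplitude-threshold stammer trimming (illustrative only).
# Per the abstract, the threshold would be predicted by a neural network;
# here it is simply passed in as a float.
import numpy as np


def remove_low_amplitude_frames(signal, sample_rate, threshold,
                                frame_ms=25, hop_ms=10):
    """Drop frames whose RMS amplitude falls below `threshold`.

    Stammered segments (repetitions, elongations, silent gaps) show
    reduced amplitude, so discarding quiet frames approximates the
    stammer-removal step described in the abstract.
    """
    frame_len = int(sample_rate * frame_ms / 1000)   # e.g. 400 samples at 16 kHz
    hop_len = int(sample_rate * hop_ms / 1000)       # e.g. 160 samples at 16 kHz

    kept = []
    for start in range(0, len(signal) - frame_len + 1, hop_len):
        frame = signal[start:start + frame_len]
        rms = np.sqrt(np.mean(frame ** 2))
        if rms >= threshold:
            # Keep only the hop-sized slice so samples are not duplicated.
            kept.append(signal[start:start + hop_len])

    if not kept:
        return np.zeros(0, dtype=signal.dtype)
    return np.concatenate(kept)


# Example usage with synthetic audio (replace with real recorded speech):
if __name__ == "__main__":
    sr = 16000
    t = np.linspace(0, 1.0, sr, endpoint=False)
    audio = 0.5 * np.sin(2 * np.pi * 220 * t)   # stand-in for clear speech
    audio[sr // 2:] *= 0.05                     # quiet, stammer-like tail
    cleaned = remove_low_amplitude_frames(audio, sr, threshold=0.1)
    print(len(audio), "->", len(cleaned), "samples after trimming")

In the pipeline described by the paper, the trimmed signal would then be passed on to the speech recognizer.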
Pages: 679-685
Page count: 7