Automatic Speech Recognition Models: A Characteristic and Performance Review

Cited by: 0
Authors
Patil, U. G. [1]
Shirbahadurkar, S. D. [2]
Paithane, A. N. [1]
Affiliations
[1] SPPU Univ, JSPMs Rajarshi Shahu Coll Engn, Dept Elect & Telecommun, Pune, Maharashtra, India
[2] SPPU Univ, Dr DY Patil Coll Engn, Dept Elect & Telecommun, Pune, Maharashtra, India
Source
2016 INTERNATIONAL CONFERENCE ON COMPUTING COMMUNICATION CONTROL AND AUTOMATION (ICCUBEA) | 2016
Keywords
speech; model; sparse; training; recognition; accuracy; Hindi; INVERSE COVARIANCE MATRICES; ACOUSTIC FACTOR-ANALYSIS; HIDDEN MARKOV-MODELS; SPEAKER IDENTIFICATION; LANGUAGE; FEATURES; SYSTEM;
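DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract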
This paper reviews a few notable speech recognition models reported in the last decade. First, the models are categorized into sparse models, learning models, and domain-specific models. Next, the characteristics of the models are examined in terms of speech constraints, algorithmic constraints, and performance constraints. The performance of these models, as reported in the literature, is then investigated and the findings are summarized. Finally, the research gaps revealed by the literature are discussed and the need for a Hindi-based speech recognition system is substantiated.
Pages: 7