Simplified design of the Speech Recognition System

Cited by: 0
Authors
Rozman, Robert [1]
Affiliations
[1] Univ Ljubljani, Fak Racunalnistvo & Informatiko, Trzaska 25, Ljubljana 1000, Slovenia
Source
ELEKTROTEHNISKI VESTNIK-ELECTROTECHNICAL REVIEW | 2013 / Vol. 80 / Issue 04
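Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code
0808; 0809;
Abstract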
The disadvantages of currently used Speech Recognition Systems (SRSs) and alternative paths for their evolution are presented and discussed. In our opinion, SRSs are rather static structures: a large amount of predefined knowledge is built into them at creation and usually remains unchanged and unadapted during the recognition process. Several possible ways of increasing the amount of dynamic, automatically learned knowledge in the next generation of SRSs are discussed; this is of particular importance for under-resourced languages. One group of SRSs, compact limited-vocabulary systems that use a Neural Network as the acoustic model, is of particular interest. Its structure is more compatible with recent developments in distributed and parallel processing. Two experimental systems are presented and tested on a simple phoneme recognition task. The first is a fairly complete SRS that uses a Neural Network as the acoustic model and Viterbi search as the time model. The second is much simpler and uses only the Neural Network as the acoustic model. It will support our further research in this field.
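The difference between the two experimental systems in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the function names, the `stay_prob` transition parameter, the uniform switching probability, and the toy posteriors are all assumptions made for illustration. The sketch takes per-frame phoneme posteriors, as a Neural Network acoustic model might produce them, and decodes them either frame by frame (the simpler system) or with a Viterbi search acting as a time model (the fuller system).

```python
import math


def framewise_decode(posteriors):
    """Simpler setup: pick the most probable phoneme in each frame independently."""
    return [max(range(len(frame)), key=frame.__getitem__) for frame in posteriors]


def viterbi_decode(posteriors, stay_prob=0.8):
    """Fuller setup: Viterbi search over frames.

    Assumption (not from the paper): a hypothetical transition model in which
    a phoneme persists with probability `stay_prob` and switches to any other
    phoneme with equal probability. Requires at least two phoneme classes.
    """
    n_states = len(posteriors[0])
    log_stay = math.log(stay_prob)
    log_switch = math.log((1.0 - stay_prob) / (n_states - 1))
    eps = 1e-12  # guards against log(0)

    def trans(i, j):
        return log_stay if i == j else log_switch

    # Log-scores of the best path ending in each state at the first frame.
    scores = [math.log(p + eps) for p in posteriors[0]]
    backptrs = []
    for frame in posteriors[1:]:
        new_scores, pointers = [], []
        for j in range(n_states):
            best_i = max(range(n_states), key=lambda i: scores[i] + trans(i, j))
            new_scores.append(scores[best_i] + trans(best_i, j) + math.log(frame[j] + eps))
            pointers.append(best_i)
        scores = new_scores
        backptrs.append(pointers)

    # Trace the best path back from the best final state.
    state = max(range(n_states), key=scores.__getitem__)
    path = [state]
    for pointers in reversed(backptrs):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Toy posteriors for two phoneme classes over five frames (hypothetical data).
toy = [[0.9, 0.1], [0.8, 0.2], [0.4, 0.6], [0.9, 0.1], [0.8, 0.2]]
print(framewise_decode(toy))  # [0, 0, 1, 0, 0]
print(viterbi_decode(toy))    # [0, 0, 0, 0, 0]
```

On this toy input the frame-wise decoder follows the single noisy frame, while the Viterbi search smooths it out because a one-frame phoneme switch is penalized twice by the assumed transition model.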
Pages: 171-176
Number of pages: 6
Related papers
8 items in total
[1] Golipour, Ladan; O'Shaughnessy, Douglas. A segmental non-parametric-based phoneme recognition approach at the acoustical level [J]. COMPUTER SPEECH AND LANGUAGE, 2012, 26 (04): 244-259
[2] Hinton G., 2012, IEEE SIGNAL PROCESSI, V29
[3] Hu ZH, 1996, ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, P1117, DOI 10.1109/ICSLP.1996.607802
[4] Rozman R, 2003, IEEE REGION 8 EUROCON 2003, VOL B, PROCEEDINGS, P171
[5] Rozman, Robert; Kodek, Dusan M. Using asymmetric windows in automatic speech recognition [J]. SPEECH COMMUNICATION, 2007, 49 (04): 268-276
[6] ROZMAN Robert, 2005, NESIMETRICNE OKENSKE, VVI, P128
[7] ROZMAN Robert, 2001, ZBORN DES EL RAC K E, P257
[8] Zue V., 1989, Speech and Natural Language. Proceedings of a Workshop, P179