AUTOMATIC LANGUAGE ANALYSIS AND IDENTIFICATION BASED ON SPEECH PRODUCTION KNOWLEDGE

Cited by: 6
Authors
Sangwan, Abhijeet [1]
Mehrabani, Mahnoosh [1]
Hansen, John H. L. [1]
Affiliations
[1] Univ Texas Dallas, Dept Elect Engn, Ctr Robust Speech Syst, Richardson, TX 75083 USA
Source
2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2010
Keywords
Phonological Features; Speech Production; Language Analysis; Language Identification
DOI
10.1109/ICASSP.2010.5495066
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this paper, a language analysis and classification system that leverages knowledge of speech production is proposed. The proposed scheme automatically extracts key production traits (or "hot-spots") that are strongly tied to the underlying language structure. Specifically, the speech utterance is first parsed into consonant and vowel clusters. Subsequently, the production traits for each cluster are represented by the corresponding temporal evolution of speech articulatory states. It is hypothesized that a selection of these production traits is strongly tied to the underlying language and can be exploited for language ID. The new scheme is evaluated on our South Indian Languages (SInL) corpus, which consists of 5 closely related languages spoken in India, namely Kannada, Tamil, Telugu, Malayalam, and Marathi. An accuracy of 65% is obtained in a difficult 5-way classification task with about 4 s of training and test speech data per utterance. Furthermore, the proposed scheme is also able to automatically identify key production traits of each language (e.g., dominant vowels, stop consonants, fricatives, etc.).
Pages: 5006-5009
Page count: 4