Spectral Feature based Automatic Tonal and Non-Tonal Language Classification

Cited by: 0
Authors
Alphonsa, Alice Celin [1]
China, Chuya [1]
Laskar, Azharuddin [1]
Laskar, Rabul Hussain [1]
Affiliation
[1] NIT Silchar, Elect & Commun Engn, Silchar, India
Source
2017 INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING, INSTRUMENTATION AND CONTROL TECHNOLOGIES (ICICICT) | 2017
Keywords
Tonal/Non-tonal languages; MHEC; MFCC; SDC; Legendre polynomial; GMM-UBM;
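DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract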
A Language Identification (LID) system identifies the language of a given speech utterance. Languages can be divided into tonal and non-tonal categories according to whether a change in pitch changes the meaning of a word. Classifying languages as tonal or non-tonal before the individual language identification stage reduces the complexity of the LID system. Although state-of-the-art systems use prosodic features for this purpose, this work analyses the performance of spectral features for tonal/non-tonal language classification. Four spectral feature sets are compared: Mel Frequency Cepstral Coefficients (MFCC), MFCC with Shifted Delta Cepstral (SDC) coefficients, Mean Hilbert Envelope Coefficients (MHEC), and MHEC with SDC coefficients. Experiments were performed on the Oregon Graduate Institute-Multilingual Telephone Speech Corpus (OGI-MLTS) and the NITS Language database using the GMM-UBM modelling technique. Results show that MHEC with SDC and MFCC with SDC features, at the syllabic level, give comparable performance, with an Equal Error Rate (EER) of 33.97% for this classification task.
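The abstract reports performance as an Equal Error Rate. As a rough illustration of that metric only, the sketch below finds the operating point where the miss rate and false-alarm rate are closest and reports their average there. It assumes each trial has a single detection score (for example, a log-likelihood ratio from a GMM-UBM system) and that higher scores favour the "target" class; the function name, the choice of target class, and the toy scores are hypothetical and not taken from the paper.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping every observed score as a threshold.

    Assumption: a trial is accepted as "target" when score >= threshold.
    Returns (eer, threshold) at the point where the false-rejection and
    false-acceptance rates are closest to each other.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best = None
    for t in thresholds:
        # Miss: a target trial scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False alarm: a non-target trial scored at or above the threshold.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Toy scores, invented purely for illustration.
target = [0.9, 0.8, 0.7, 0.4]
nontarget = [0.6, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(target, nontarget)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With only a few trials the two error rates may never be exactly equal, which is why the sketch averages them at the closest point; evaluation toolkits typically interpolate instead.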
Pages: 1271-1276
Number of pages: 6