Native Language Identification in Very Short Utterances Using Bidirectional Long Short-Term Memory Network

Cited by: 20
Authors
Adeeba, Farah [1]
Hussain, Sarmad [2]
Affiliations
[1] Univ Engn & Technol, Dept Comp Sci & Engn, Lahore 54890, Pakistan
[2] Univ Engn & Technol, Ctr Language Engn, Al Khwarizmi Inst Comp Sci, Lahore 54890, Pakistan
Keywords
Native language identification; BLSTM; RNN; Urdu L2; Prosodic features; Speaker
DOI
10.1109/ACCESS.2019.2896453
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Native language identification (NLI) is the task of identifying the first language of a user based on their speech or written text in a second language. In this paper, we propose the use of spectrogram- and cochleagram-based features extracted from very short speech utterances (0.8 s on average) to infer the native language of an Urdu speaker. Bidirectional long short-term memory (BLSTM) neural networks are adopted to classify utterances by native language. A set of experiments is carried out for the network architecture search, and the system's accuracy is evaluated on the validation data set. Overall accuracies of 74.81% and 71.61% are achieved using Mel-frequency cepstral coefficients (MFCC) and Gammatone frequency cepstral coefficients (GFCC), respectively. Moreover, the optimized MFCC-based BLSTM network and the optimized GFCC-based BLSTM network are merged to take advantage of both feature sets. The experiments show that the merged network outperforms the individual BLSTM networks, achieving an accuracy of 75.69% on the evaluation data. The effect of test data duration is also analyzed (from 0.27 s to 1.5 s); with utterances as short as 0.4 s, an accuracy of over 50% can still be achieved.
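
The merged network described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the feature dimensions, hidden sizes, number of native-language classes, or how the two networks are joined, so the values below (13 coefficients per frame, 8 hidden units, 4 classes) and the choice to concatenate the two BLSTM outputs before a softmax layer are assumptions made for illustration. Weights are random and untrained, and bias terms are omitted for brevity.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_lstm(input_dim, hidden_dim, rng):
    """Random weights for one LSTM direction (gates i, f, o and candidate g)."""
    size = input_dim + hidden_dim
    return {
        gate: [[rng.uniform(-0.1, 0.1) for _ in range(size)]
               for _ in range(hidden_dim)]
        for gate in "ifog"
    }


def run_lstm(params, frames):
    """Run an LSTM over a sequence of feature frames; return the last hidden state."""
    hidden_dim = len(params["i"])
    h = [0.0] * hidden_dim
    c = [0.0] * hidden_dim
    for x in frames:
        z = list(x) + h  # current frame concatenated with previous hidden state
        pre = {g: [sum(w * v for w, v in zip(row, z)) for row in params[g]]
               for g in "ifog"}
        c = [sigmoid(pre["f"][k]) * c[k]
             + sigmoid(pre["i"][k]) * math.tanh(pre["g"][k])
             for k in range(hidden_dim)]
        h = [sigmoid(pre["o"][k]) * math.tanh(c[k]) for k in range(hidden_dim)]
    return h


def run_blstm(fwd, bwd, frames):
    """Bidirectional pass: forward and time-reversed LSTM states, concatenated."""
    return run_lstm(fwd, frames) + run_lstm(bwd, frames[::-1])


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_model(mfcc_dim, gfcc_dim, hidden_dim, n_classes, seed=0):
    """Hypothetical two-branch model: one BLSTM per feature set plus an output layer."""
    rng = random.Random(seed)
    return {
        "mfcc": (make_lstm(mfcc_dim, hidden_dim, rng),
                 make_lstm(mfcc_dim, hidden_dim, rng)),
        "gfcc": (make_lstm(gfcc_dim, hidden_dim, rng),
                 make_lstm(gfcc_dim, hidden_dim, rng)),
        # Output layer sees both branches, each with two directions.
        "out": [[rng.uniform(-0.1, 0.1) for _ in range(4 * hidden_dim)]
                for _ in range(n_classes)],
    }


def merged_predict(mfcc_frames, gfcc_frames, model):
    """Concatenate both BLSTM outputs (assumed merge) and return class probabilities."""
    merged = (run_blstm(*model["mfcc"], mfcc_frames)
              + run_blstm(*model["gfcc"], gfcc_frames))
    scores = [sum(w * v for w, v in zip(row, merged)) for row in model["out"]]
    return softmax(scores)


# Toy usage with random stand-in features (not real MFCC/GFCC values).
demo_rng = random.Random(42)
demo_mfcc = [[demo_rng.uniform(-1, 1) for _ in range(13)] for _ in range(80)]
demo_gfcc = [[demo_rng.uniform(-1, 1) for _ in range(13)] for _ in range(80)]
demo_model = make_model(13, 13, hidden_dim=8, n_classes=4)
print(merged_predict(demo_mfcc, demo_gfcc, demo_model))
```

In practice the two branches would be trained on real MFCC and GFCC frames; the sketch only shows how per-feature BLSTM outputs could feed a shared classification layer.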
Pages: 17098-17110
Number of pages: 13