Bottleneck Feature-Based Hybrid Deep Autoencoder Approach for Indian Language Identification

Cited by: 6
Authors
Das, Himanish Shekhar [1]
Roy, Pinki [1]
Affiliations
[1] Natl Inst Technol Silchar, Dept Comp Sci & Engn, Silchar 788010, Assam, India
Keywords
Speech processing; Language identification; Bottleneck feature; Deep autoencoder; Softmax regression; Jaya optimization; EXCITATION SOURCE INFORMATION; REPRESENTATION; CLASSIFICATION; SPEAKER; EXTRACTION; SPEECH; MODEL;
DOI
10.1007/s13369-020-04430-9
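Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code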
07 ; 0710 ; 09 ;
Abstract
New approaches are needed to overcome the communication barrier between different languages in speech processing. An automatic language identification system identifies the spoken language from speech utterances, and feature selection is one of its most challenging tasks. This paper proposes a bottleneck feature-based hybrid deep autoencoder approach that identifies the language of a given speech signal from its language-specific features. In the proposed approach, Mel-frequency cepstral coefficients, linear prediction coefficients, and shifted delta coefficients are first extracted directly from multilingual speech utterances. A bottleneck feature is then extracted from the bottleneck layer of a bottleneck deep neural network. The recognition rate is evaluated for each feature set to find the best feature. The best feature, together with the other features, is then used as input to a deep autoencoder with softmax regression, which identifies the language based on class labels. The deep autoencoder is fine-tuned toward the global optimum using the Jaya optimization algorithm. The experiments use a recorded database of four Indian languages, with special emphasis on northeastern languages. The results show that the proposed hybrid approach using the bottleneck feature with shifted delta coefficients performs well, reaching 97.10% accuracy, and gives superior results compared with the baseline deep neural network-based approaches.
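The abstract names the Jaya optimization algorithm as the fine-tuning step but gives no details of how it is applied to the autoencoder weights. As a rough illustration only, the sketch below shows the generic Jaya update rule (move each candidate toward the current best solution and away from the current worst, keeping the move only if it improves the objective) on a toy objective. The function names `jaya_minimize` and `sphere`, the population size, the bounds, and the toy objective are all assumptions for illustration and are not taken from the paper.

```python
import random


def jaya_minimize(objective, bounds, pop_size=20, iterations=200, seed=0):
    """Generic Jaya search (illustrative sketch, not the paper's setup).

    objective: function mapping a list of floats to a float to minimize.
    bounds: list of (low, high) pairs, one per dimension.
    """
    rng = random.Random(seed)
    # Random initial population inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(x) for x in pop]

    for _ in range(iterations):
        # Best and worst candidates are fixed for the whole iteration.
        best = pop[scores.index(min(scores))]
        worst = pop[scores.index(max(scores))]
        for i, x in enumerate(pop):
            candidate = []
            for j, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Jaya update: toward the best, away from the worst.
                v = (x[j]
                     + r1 * (best[j] - abs(x[j]))
                     - r2 * (worst[j] - abs(x[j])))
                candidate.append(min(max(v, lo), hi))  # clip to bounds
            s = objective(candidate)
            # Greedy acceptance: keep the candidate only if it is better.
            if s < scores[i]:
                pop[i], scores[i] = candidate, s

    k = scores.index(min(scores))
    return pop[k], scores[k]


def sphere(x):
    """Toy objective (assumption): sum of squares, minimum at the origin."""
    return sum(v * v for v in x)


best_x, best_score = jaya_minimize(sphere, [(-5.0, 5.0)] * 3)
print(best_x, best_score)
```

In a setting like the one the abstract describes, the candidate vector would presumably hold network parameters and the objective would be a training or validation loss; that mapping is a guess, since the record does not specify it. Jaya has no algorithm-specific tuning parameters beyond population size and iteration count.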
Pages: 3425-3436
Number of pages: 12