A Hybrid Meta-Heuristic Feature Selection Method for Identification of Indian Spoken Languages From Audio Signals

Cited by: 32
Authors
Das, Aankit [1]
Guha, Samarpan [1]
Singh, Pawan Kumar [2]
Ahmadian, Ali [3, 4]
Senu, Norazak [4]
Sarkar, Ram [5]
Affiliations
[1] Univ Calcutta, Inst Radio Phys & Elect, Kolkata 700009, India
[2] Jadavpur Univ, Dept Informat Technol, Kolkata 700106, India
[3] Natl Univ Malaysia UKM, Inst IR 4.0, Bangi 43600, Malaysia
[4] Univ Putra Malaysia, Inst Math Res, Serdang 43400, Malaysia
[5] Jadavpur Univ, Dept Comp Sci & Engn, Kolkata 700032, India
Keywords
Feature extraction; Mel frequency cepstral coefficient; Machine learning; Classification algorithms; Databases; Complexity theory; Discrete wavelet transforms; Spoken language identification; feature selection; binary Bat algorithm; late acceptance hill climbing algorithm; MFCC and LPC features; ALGORITHM; MFCC; OPTIMIZATION; RECOGNITION
DOI
10.1109/ACCESS.2020.3028241
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
With recent advances in machine learning and artificial intelligence, applications based on spoken language identification have an increasing impact on the day-to-day lives of common people. Western countries have enjoyed spoken language recognition-based applications for a while now; however, such applications have not gained much popularity in multilingual countries like India owing to various complexities. In this paper, we address this issue by attempting to identify different Indian languages using various well-known features, namely Mel-Frequency Cepstral Coefficient (MFCC), Linear Prediction Coefficient (LPC), Discrete Wavelet Transform (DWT), and Gammatone Frequency Cepstral Coefficient (GFCC), as well as a few deep learning architecture-based features such as i-vector and x-vector, all extracted from the audio signals. A comparison of the initial results shows that the combination of MFCC and LPC produces the best results. We then develop a new nature-inspired feature selection (FS) algorithm by hybridizing the Binary Bat Algorithm (BBA) with Late Acceptance Hill-Climbing (LAHC) to select the optimal subset from these feature vectors, which reduces model complexity and helps the model train faster. Using a Random Forest (RF) classifier, we achieve an accuracy of 92.35% on the Indic TTS database developed by IIT-Madras and an accuracy of 100% on the Indic Speech database developed by the Speech and Vision Laboratory (SVL), IIIT-Hyderabad. The proposed algorithm is also found to outperform many standard meta-heuristic FS algorithms. The source code of this work is available at: https://github.com/CodeChef97dotcom/Feature-Selection
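The abstract names Late Acceptance Hill-Climbing (LAHC) as the local search paired with the Binary Bat Algorithm. The following is only a minimal sketch of the generic LAHC acceptance rule applied to a binary feature mask, not the authors' implementation (their code is at the repository linked above). The function names, the parameters `history_len` and `max_iters`, and the toy cost function are all assumptions made for illustration; in a real pipeline the cost would be something like the validation error of a classifier trained on the selected features, and the bat-algorithm stage is omitted entirely.

```python
import random

# Hypothetical ground truth for the toy problem: only these feature indices help.
INFORMATIVE = {0, 3, 5}


def toy_cost(mask):
    """Toy stand-in for classifier error (lower is better).

    Charges 10 for each informative feature left out and 1 for each
    uninformative feature kept. This is an assumption for illustration only.
    """
    missing = sum(1 for i in INFORMATIVE if not mask[i])
    extra = sum(1 for i, on in enumerate(mask) if on and i not in INFORMATIVE)
    return 10 * missing + extra


def lahc_select(cost_fn, n_features, history_len=5, max_iters=5000, seed=0):
    """Late Acceptance Hill-Climbing over boolean feature masks.

    A candidate is accepted if it is no worse than the current solution
    or no worse than the cost recorded history_len iterations earlier.
    """
    rng = random.Random(seed)
    current = [rng.random() < 0.5 for _ in range(n_features)]
    current_cost = cost_fn(current)
    best, best_cost = current[:], current_cost
    history = [current_cost] * history_len  # circular buffer of past costs

    for it in range(max_iters):
        # Neighbour: flip one randomly chosen feature bit.
        candidate = current[:]
        j = rng.randrange(n_features)
        candidate[j] = not candidate[j]
        candidate_cost = cost_fn(candidate)

        slot = it % history_len
        if candidate_cost <= current_cost or candidate_cost <= history[slot]:
            current, current_cost = candidate, candidate_cost
        history[slot] = current_cost  # record the cost for later comparisons

        if current_cost < best_cost:
            best, best_cost = current[:], current_cost

    return best, best_cost


if __name__ == "__main__":
    mask, cost = lahc_select(toy_cost, n_features=8)
    print("selected features:", [i for i, on in enumerate(mask) if on])
    print("cost:", cost)
```

Comparing against a delayed cost rather than only the current one lets the search accept some worsening moves early on, which is the property that distinguishes LAHC from plain hill climbing.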
Pages: 181432-181449
Page count: 18