A lazy learning-based language identification from speech using MFCC-2 features

Authors
Himadri Mukherjee
Sk Md Obaidullah
K. C. Santosh
Santanu Phadikar
Kaushik Roy
Affiliations
[1] West Bengal State University,Department of Computer Science
[2] Aliah University,Department of Computer Science and Engineering
[3] The University of South Dakota,Department of Computer Science
[4] Maulana Abul Kalam Azad University of Technology,Department of Computer Science and Engineering
Keywords
Lazy learning; Speech recognition; Language identification; Mel frequency cepstral coefficient-based features
Abstract
Developing an automatic speech recognition system for multilingual countries like India is challenging because people are accustomed to using multiple languages while talking. This makes language identification from speech an essential step prior to recognition. In this paper, a system is proposed for language identification from multilingual speech signals. A new second-level Mel frequency cepstral coefficient-based feature named MFCC-2, which handles the large and uneven dimensionality of MFCC, is used to distinguish among English, Bangla and Hindi. The system has been tested on recordings of 12,000 utterances of numerals and on 41,884 clips extracted from YouTube videos, taking into account background music, data from multiple environments, the absence of noise suppression, and the use of keywords from different languages in a single phrase. The highest and average accuracies (for the Top-3 classifiers from a pool of nine classifiers) of 98.09% and 95.54%, respectively, were achieved for the YouTube data.
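The abstract does not define how MFCC-2 is computed, so the sketch below is only a generic illustration of the problem it names: clips of different lengths yield MFCC matrices of different sizes, and a classifier needs a fixed-length vector. It assumes a simple per-coefficient mean and standard deviation as the second-level summary (this is not the authors' feature) and uses a 1-nearest-neighbour rule as a stand-in for a lazy learner. The function names, toy data and labels are all hypothetical.

```python
import math
from statistics import mean, pstdev


def summarize_frames(frames):
    """Collapse a variable-length list of per-frame coefficient vectors
    into one fixed-length vector: per-coefficient means, then std devs.
    (Assumed summary for illustration; not the MFCC-2 definition.)"""
    columns = list(zip(*frames))  # one tuple per coefficient
    return [mean(c) for c in columns] + [pstdev(c) for c in columns]


def nearest_neighbor_label(query, training):
    """1-NN lazy learner: no training step, all work happens at query time."""
    best_label, _ = min(
        ((label, math.dist(query, vec)) for vec, label in training),
        key=lambda pair: pair[1],
    )
    return best_label


# Toy stand-ins for MFCC matrices (frames x coefficients); clips differ in length.
clip_a = [[1.0, 2.0], [1.2, 2.1], [0.9, 1.9]]
clip_b = [[5.0, -1.0], [5.1, -1.2]]
query_clip = [[1.1, 2.0], [1.0, 2.2], [1.0, 1.8], [0.9, 2.0]]

training = [
    (summarize_frames(clip_a), "lang_A"),
    (summarize_frames(clip_b), "lang_B"),
]
predicted = nearest_neighbor_label(summarize_frames(query_clip), training)
print(predicted)
```

Whatever the clip length, every summary vector here has twice as many entries as there are coefficients, which is what lets a distance-based lazy learner compare clips directly.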
Pages: 1 / 14
Number of pages: 13