A lazy learning-based language identification from speech using MFCC-2 features

Authors
Himadri Mukherjee
Sk Md Obaidullah
K. C. Santosh
Santanu Phadikar
Kaushik Roy
Affiliations
[1] West Bengal State University, Department of Computer Science
[2] Aliah University, Department of Computer Science and Engineering
[3] The University of South Dakota, Department of Computer Science
[4] Maulana Abul Kalam Azad University of Technology, Department of Computer Science and Engineering
Keywords
Lazy learning; Speech recognition; Language identification; Mel frequency cepstral coefficient-based features
DOI
Not available
Abstract
Developing an automatic speech recognition system for multilingual countries like India is challenging because people habitually use multiple languages while talking. This makes language identification from speech an essential step prior to recognition. In this paper, a system is proposed for language identification from multilingual speech signals. A new second-level Mel frequency cepstral coefficient-based feature named MFCC-2, which handles the large and uneven dimensionality of MFCC, has been used to distinguish among English, Bangla and Hindi. The system has been tested on recordings of as many as 12,000 utterances of numerals and 41,884 clips extracted from YouTube videos, covering background music, data from multiple environments, no noise suppression, and keywords from different languages within a single phrase. The highest and average accuracies (for the top three classifiers from a pool of nine) of 98.09% and 95.54%, respectively, were achieved for the YouTube data.
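The two ideas named in the abstract, a fixed-length feature built on top of variable-length MFCC frames and a lazy learner, can be sketched roughly as follows. The abstract does not say how MFCC-2 is computed, so the per-coefficient mean and standard deviation used here are an assumption standing in for it, not the authors' feature. The 1-nearest-neighbor rule is likewise only one example of lazy learning, and the frame values and the labels `language_A` and `language_B` are hypothetical toy data.

```python
import math
from statistics import mean, pstdev


def summarize(frames):
    """Collapse a variable-length list of per-frame coefficient vectors
    into one fixed-length vector: per-coefficient means, then std devs.
    (Assumed stand-in for a second-level feature; not the paper's MFCC-2.)"""
    columns = list(zip(*frames))  # one tuple of values per coefficient
    return [mean(c) for c in columns] + [pstdev(c) for c in columns]


def nearest_neighbor(train, query):
    """Lazy 1-NN: nothing is fitted in advance; the query is compared
    against every stored example at prediction time."""
    best_label, best_dist = None, math.inf
    for features, label in train:
        d = math.dist(features, query)  # Euclidean distance
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label


# Toy stand-ins for MFCC frame matrices (2 coefficients per frame).
clip_a = [[1.0, 0.0], [1.2, 0.1], [0.9, -0.1]]              # 3 frames
clip_b = [[5.0, 2.0], [5.1, 2.2], [4.9, 1.9], [5.0, 2.1]]   # 4 frames
train = [(summarize(clip_a), "language_A"),
         (summarize(clip_b), "language_B")]

query = [[1.1, 0.0], [1.0, 0.05]]                           # 2 frames
predicted = nearest_neighbor(train, summarize(query))
print(predicted)
```

Clips of three, four and two frames all map to vectors of the same length, which is what lets a distance-based lazy learner compare them directly.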
Pages: 1-14
Number of pages: 13
Related papers
50 in total
  • [1] A lazy learning-based language identification from speech using MFCC-2 features
    Mukherjee, Himadri
    Obaidullah, Sk Md
    Santosh, K. C.
    Phadikar, Santanu
    Roy, Kaushik
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2020, 11 (01) : 1 - 14
  • [2] Language Identification System using MFCC and Prosodic Features
    Bhattacharjee, Utpal
    Sarmah, Kshirod
    2013 INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS AND SIGNAL PROCESSING (ISSP), 2013, : 194 - 197
  • [3] Automatic spoken language identification using MFCC based time series features
    Mainak Biswas
    Saif Rahaman
    Ali Ahmadian
    Kamalularifin Subari
    Pawan Kumar Singh
    Multimedia Tools and Applications, 2023, 82 : 9565 - 9595
  • [4] Automatic spoken language identification using MFCC based time series features
    Biswas, Mainak
    Rahaman, Saif
    Ahmadian, Ali
    Subari, Kamalularifin
    Singh, Pawan Kumar
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (07) : 9565 - 9595
  • [5] Neural Network Based Recognition of Speech Using MFCC Features
    Barua, Pialy
    Ahmad, Kanij
    Khan, Ainul Anam Shahjamal
    Sanaullah, Muhammad
    2014 INTERNATIONAL CONFERENCE ON INFORMATICS, ELECTRONICS & VISION (ICIEV), 2014,
  • [6] Language Identification From Speech Features Using SVM and LDA
    Anjana, J. S.
    Poorna, S. S.
    2018 INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS, SIGNAL PROCESSING AND NETWORKING (WISPNET), 2018,
  • [7] Parkinson disease prediction using machine learning-based features from speech signal
    Linlin Yuan
    Yao Liu
    Hsuan-Ming Feng
    Service Oriented Computing and Applications, 2024, 18 : 101 - 107
  • [8] Parkinson disease prediction using machine learning-based features from speech signal
    Yuan, Linlin
    Liu, Yao
    Feng, Hsuan-Ming
    SERVICE ORIENTED COMPUTING AND APPLICATIONS, 2024, 18 (01) : 101 - 107
  • [9] Speaker identification and localization using shuffled MFCC features and deep learning
    Barhoush M.
    Hallawa A.
    Schmeink A.
    International Journal of Speech Technology, 2023, 26 (01) : 185 - 196
  • [10] Deep Learning Based Language Identification System From Speech
    Athira, N. P.
    Poorna, S. S.
    PROCEEDINGS OF THE 2019 INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND CONTROL SYSTEMS (ICCS), 2019, : 1094 - 1097