Feature Extraction Methods in Language Identification: A Survey

Cited by: 26
Authors
Deshwal, Deepti [1, 2]
Sangwan, Pardeep [1]
Kumar, Divya [2]
Affiliations
[1] Maharaja Surajmal Inst Technol, Dept ECE, New Delhi, India
[2] IFTM Univ, Dept ECE, Moradabad, Uttar Pradesh, India
Keywords
Feature extraction; Language identification; Noise compensation; Mel-frequency cepstral coefficients (MFCC); Shifted delta cepstral coefficients (SDCs); AUTOMATIC SPEECH RECOGNITION; FRONT-END; SPECTRAL FEATURES; NEURAL-NETWORKS; ENHANCEMENT; SPEAKER; ROBUST; NOISE; CLASSIFICATION; SYSTEM;
DOI
10.1007/s11277-019-06373-3
Chinese Library Classification (CLC) number
TN [Electronic technology, communication technology]
Discipline classification code
0809
Abstract
Language Identification (LI) is a rapidly emerging field in speech processing that aims to accurately identify the language of speech data from features of the speech signal. LI technologies have a wide range of applications in different spheres owing to growing advances in artificial intelligence and machine learning. Feature extraction is one of the fundamental and significant processes performed in LI. This review presents the main research paradigms in feature extraction methods, giving researchers insight into feature extraction techniques for future studies in LI. Broadly, the review summarizes and compares various feature extraction approaches with and without noise compensation techniques, as the current trend is towards a robust, universal LI framework. The paper categorizes feature extraction approaches on the basis of the features used, the human speech production system or peripheral auditory system, spectral or cepstral analysis, and, lastly, the transform applied. The different noise compensation-based feature extraction techniques are also covered. The paper also shows that Mel-Frequency Cepstral Coefficients (MFCCs) are the most popular approach. Results indicate that MFCCs fused with other feature vectors and cleansing approaches give improved performance compared with purely MFCC-based feature extraction approaches. The study also describes the different categories at the front end of the LI system from a research point of view.
Pages: 2071-2103
Number of pages: 33