Feature Extraction Methods in Language Identification: A Survey

Cited by: 26

Authors:

Deshwal, Deepti ^{[1,2]}

Sangwan, Pardeep ^{[1]}

Kumar, Divya ^{[2]}

Affiliations:

[1] Maharaja Surajmal Inst Technol, Dept ECE, New Delhi, India

[2] IFTM Univ, Dept ECE, Moradabad, Uttar Pradesh, India

Source:

WIRELESS PERSONAL COMMUNICATIONS | 2019 / Vol. 107 / Issue 04

Keywords:

Feature extraction; Language identification; Noise compensation; Mel-frequency cepstral coefficients (MFCC); Shifted delta cepstral coefficients (SDCs); AUTOMATIC SPEECH RECOGNITION; FRONT-END; SPECTRAL FEATURES; NEURAL-NETWORKS; ENHANCEMENT; SPEAKER; ROBUST; NOISE; CLASSIFICATION; SYSTEM;

DOI:

10.1007/s11277-019-06373-3

Chinese Library Classification (CLC) number:

TN [Electronic technology, communication technology];

Subject classification code:

0809;

Abstract:

Language Identification (LI) is a rapidly emerging field in speech processing in which the language is accurately identified from the database based on features of the speech signal. LI technologies have a wide range of applications owing to growing advances in artificial intelligence and machine learning. Feature extraction is one of the fundamental and significant processes performed in LI. This review presents the main paradigms of research in feature extraction methods, giving researchers deeper insight into feature extraction techniques for future studies in LI. Broadly, the review summarizes and compares various feature extraction approaches with and without noise compensation techniques, as the current trend is towards a robust, universal language identification framework. The paper categorizes feature extraction approaches on the basis of the features used, the human speech production system or peripheral auditory system, spectral or cepstral analysis, and, lastly, the transform applied. Noise compensation-based feature extraction techniques are also covered. The paper also shows that Mel-Frequency Cepstral Coefficients (MFCCs) are the most popular approach. Results indicate that MFCCs fused with other feature vectors and cleansing approaches give improved performance compared with pure MFCC-based feature extraction approaches. The study also describes the different categories at the front end of the LI system from a research point of view.

Citation

Pages: 2071-2103

Page count: 33

References: 112 in total

