Spoken Language Recognition: From Fundamentals to Practice

Cited by: 204
Authors
Li, Haizhou [1, 2]
Ma, Bin [1]
Lee, Kong Aik [1]
Affiliations
[1] ASTAR, Inst Infocomm Res, Singapore 138632, Singapore
[2] Univ New S Wales, Kensington, NSW 2052, Australia
Keywords
Acoustic features; calibration; classifier; fusion; language recognition evaluation (LRE); phonotactic features; spoken language recognition; tokenization; vector space modeling; SUPPORT VECTOR MACHINES; JOINT FACTOR-ANALYSIS; HIDDEN MARKOV-MODELS; FRONT-END; PHONOTACTIC FEATURES; SPEAKER; IDENTIFICATION; COMPENSATION; MIXTURES; FUSION
DOI
10.1109/JPROC.2012.2237151
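Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology]
Subject classification codes
0808; 0809
Abstract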
Spoken language recognition refers to the automatic process through which we determine or verify the identity of the language spoken in a speech sample. We study a computational framework that allows such a decision to be made in a quantitative manner. In recent decades, we have made tremendous progress in spoken language recognition, which benefited from technological breakthroughs in related areas, such as signal processing, pattern recognition, cognitive science, and machine learning. In this paper, we attempt to provide an introductory tutorial on the fundamentals of the theory and the state-of-the-art solutions, from both phonological and computational aspects. We also give a comprehensive review of current trends and future research directions using the language recognition evaluation (LRE) formulated by the National Institute of Standards and Technology (NIST) as the case studies.
Pages: 1136-1159
Number of pages: 24