Integrating acoustic, prosodic and phonotactic features for spoken language identification

Cited by: 0

Authors:

Tong, Rong [1]

Ma, Bin [1]

Zhu, Donglai [1]

Li, Haizhou [1]

Chng, Eng Siong [1]

Affiliations:

[1] Inst Infocomm Res, Singapore, Singapore

Source:

2006 IEEE International Conference on Acoustics, Speech and Signal Processing, Vols 1-13 | 2006

Keywords:

DOI:

Not available

Chinese Library Classification number:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

A fundamental issue in automatic language identification is finding effective discriminative cues for languages. This paper studies the fusion of five features at different levels of abstraction for language identification: spectrum, duration, pitch, n-gram phonotactic, and bag-of-sounds features. We build a system and report test results on the NIST 1996 and 2003 LRE datasets. The system was also built to participate in the NIST 2005 LRE. The experimental results show that different levels of information provide complementary language cues. The prosodic features are more effective for shorter utterances, while the phonotactic features work better for longer utterances. For the 12-language task, the system with fusion of all five features achieved 2.38% EER for 30-sec speech segments on the NIST 1996 dataset.
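The abstract reports its headline result as an equal error rate (EER), the operating point at which the miss rate and the false-alarm rate coincide. As a rough illustration of how such a figure can be approximated from detection scores, here is a minimal sketch. It is not the paper's or NIST's scoring tool; the function name `equal_error_rate` and the toy scores are assumptions made up for this example.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by using every observed score as a threshold.

    target_scores: scores for trials where the hypothesized language is correct.
    nontarget_scores: scores for trials where it is not.
    A trial is accepted when its score is >= the threshold.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, best_eer = float("inf"), None
    for t in thresholds:
        # Miss rate: correct-language trials rejected at threshold t.
        miss = sum(s < t for s in target_scores) / len(target_scores)
        # False-alarm rate: wrong-language trials accepted at threshold t.
        fa = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(miss - fa)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, best_eer = gap, (miss + fa) / 2
    return best_eer


# Toy scores, invented purely for illustration (not data from the paper).
targets = [0.9, 0.8, 0.7, 0.3]
nontargets = [0.6, 0.4, 0.2, 0.1]
print(equal_error_rate(targets, nontargets))  # 0.25
```

With only a handful of trials the two error curves may not cross exactly, so the sketch returns the average of the two rates at the closest threshold; evaluation toolkits typically interpolate instead.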


Pages: 205-208

Page count: 4
