Integrating acoustic, prosodic and phonotactic features for spoken language identification

Cited by: 0
Authors
Tong, Rong [1]
Ma, Bin [1]
Zhu, Donglai [1]
Li, Haizhou [1]
Chng, Eng Siong [1]
Affiliations
[1] Inst Infocomm Res, Singapore, Singapore
Source
2006 IEEE International Conference on Acoustics, Speech and Signal Processing, Vols 1-13, 2006
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
The fundamental issue in automatic language identification is to find effective discriminative cues for languages. This paper studies the fusion of five features at different levels of abstraction for language identification: spectrum, duration, pitch, n-gram phonotactic, and bag-of-sounds features. We build a system and report test results on the NIST 1996 and 2003 LRE datasets. The system was also built to participate in the NIST 2005 LRE. The experimental results show that different levels of information provide complementary language cues. The prosodic features are more effective for shorter utterances, while the phonotactic features work better for longer utterances. For the 12-language task, the system fusing all five features achieved 2.38% EER for 30-second speech segments on the NIST 1996 dataset.
Pages: 205-208
Page count: 4
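The abstract reports results as an equal error rate (EER) after fusing scores from several feature streams. As a rough illustration only, the sketch below shows one common way to combine per-feature detection scores (a weighted sum) and to estimate EER from target and non-target trial scores. The abstract does not say how the authors fused their features, so the linear fusion, the function names `fuse_scores` and `equal_error_rate`, and all weights and scores here are assumptions made for the example, not the paper's method or data.

```python
def fuse_scores(scores_by_feature, weights):
    """Weighted-sum fusion of per-feature scores (an assumed scheme,
    not necessarily the one used in the paper)."""
    n_trials = len(next(iter(scores_by_feature.values())))
    fused = [0.0] * n_trials
    for name, scores in scores_by_feature.items():
        for i, s in enumerate(scores):
            fused[i] += weights[name] * s
    return fused


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores
    and picking the point where miss and false-alarm rates are closest."""
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # Miss: a true-language trial scored below the threshold.
        miss = sum(s < t for s in target_scores) / len(target_scores)
        # False alarm: a wrong-language trial scored at or above the threshold.
        fa = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(miss - fa)
        if gap < best_gap:
            best_gap, eer = gap, (miss + fa) / 2
    return eer


# Hypothetical scores for two trials from two feature streams.
fused = fuse_scores(
    {"spectrum": [1.0, 0.0], "pitch": [0.0, 1.0]},
    {"spectrum": 0.75, "pitch": 0.25},
)

# Hypothetical target / non-target scores with one overlapping pair.
eer = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1])
```

In practice, fusion weights would be tuned on held-out development data, and EER is usually read off a DET curve with interpolation rather than the coarse threshold sweep used here.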