Integrating acoustic, prosodic and phonotactic features for spoken language identification

Cited by: 0
Authors
Tong, Rong [1]
Ma, Bin [1]
Zhu, Donglai [1]
Li, Haizhou [1]
Chng, Eng Siong [1]
Affiliation
[1] Inst Infocomm Res, Singapore, Singapore
Source
2006 IEEE International Conference on Acoustics, Speech and Signal Processing, Vols 1-13 | 2006
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Discipline classification codes
070206; 082403;
Abstract
The fundamental issue in automatic language identification is to find effective discriminative cues for languages. This paper studies the fusion of five features at different levels of abstraction for language identification: spectrum, duration, pitch, n-gram phonotactic, and bag-of-sounds features. We build a system and report test results on the NIST 1996 and 2003 LRE datasets. The system was also built to participate in the NIST 2005 LRE. The experimental results show that different levels of information provide complementary language cues. The prosodic features are more effective for shorter utterances, while the phonotactic features work better for longer utterances. For the 12-language task, the system with fusion of five features achieved 2.38% EER for 30-sec speech segments on the NIST 1996 dataset.
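The abstract reports results as an equal error rate (EER) after fusing several feature streams, but it does not say how the fusion is done. The following is a minimal sketch only, assuming a weighted linear combination of per-feature scores (an assumption, not the paper's stated method) and a simple threshold sweep to estimate the EER. The function names `fuse_scores` and `equal_error_rate` and all score values are hypothetical.

```python
def fuse_scores(per_feature_scores, weights):
    """Weighted linear fusion (assumed scheme): one fused score per trial.

    per_feature_scores: list of score lists, one list per feature stream,
    all of the same length (one entry per trial).
    """
    n_trials = len(per_feature_scores[0])
    return [
        sum(w * scores[i] for w, scores in zip(weights, per_feature_scores))
        for i in range(n_trials)
    ]


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Returns the mean of miss and false-alarm rates at the threshold
    where the two rates are closest.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # Miss: a target trial scored below the threshold.
        p_miss = sum(s < t for s in target_scores) / len(target_scores)
        # False alarm: a non-target trial scored at or above the threshold.
        p_fa = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(p_miss - p_fa)
        if gap < best_gap:
            best_gap, eer = gap, (p_miss + p_fa) / 2
    return eer


# Toy, made-up scores for illustration only.
targets = [0.9, 0.8, 0.7, 0.4]
nontargets = [0.6, 0.3, 0.2, 0.1]
print(equal_error_rate(targets, nontargets))
```

In practice the fusion weights would be tuned on held-out data, and the EER would be computed over many more trials than this toy example uses.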
Pages: 205-208
Page count: 4