Integrating acoustic, prosodic and phonotactic features for spoken language identification

Citations: 0
Authors
Tong, Rong [1]
Ma, Bin [1]
Zhu, Donglai [1]
Li, Haizhou [1]
Chng, Eng Siong [1]
Affiliations
[1] Inst Infocomm Res, Singapore, Singapore
Source
2006 IEEE International Conference on Acoustics, Speech and Signal Processing, Vols 1-13 | 2006
Keywords
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Discipline classification codes
070206; 082403;
Abstract
A fundamental issue in automatic language identification is finding effective discriminative cues for languages. This paper studies the fusion of five features at different levels of abstraction for language identification: spectrum, duration, pitch, n-gram phonotactic, and bag-of-sounds features. We build a system and report test results on the NIST 1996 and 2003 LRE datasets. The system was also built to participate in the NIST 2005 LRE. The experimental results show that different levels of information provide complementary language cues. The prosodic features are more effective for shorter utterances, while the phonotactic features work better for longer utterances. For the 12-language task, the system with fusion of all five features achieved 2.38% EER for 30-sec speech segments on the NIST 1996 dataset.
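To make the reported metric concrete, the following is a minimal sketch of how an equal error rate (EER) can be estimated from detection scores, together with a simple weighted-sum fusion of per-feature scores. The abstract does not describe the fusion method the authors used, so the linear fusion here is only an assumed stand-in, and the function names `fuse_scores` and `equal_error_rate` and all score values are hypothetical.

```python
def fuse_scores(per_feature_scores, weights):
    """Weighted sum of per-feature scores (assumed fusion rule, not the paper's)."""
    return sum(w * s for w, s in zip(weights, per_feature_scores))


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores."""
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # Miss: a target trial scored below the threshold.
        miss = sum(s < t for s in target_scores) / len(target_scores)
        # False alarm: a non-target trial scored at or above the threshold.
        fa = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(miss - fa)
        if gap < best_gap:
            # Keep the operating point where the two error rates are closest.
            best_gap, eer = gap, (miss + fa) / 2
    return eer


# Made-up scores for illustration only.
targets = [0.9, 0.8, 0.7, 0.4]
nontargets = [0.6, 0.3, 0.2, 0.1]
print(equal_error_rate(targets, nontargets))
```

The EER is the operating point at which the miss rate and the false-alarm rate are equal; with a finite set of trials the two rates rarely match exactly, so this sketch averages them at the threshold where they are closest.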
Pages: 205-208
Page count: 4