Integrating acoustic, prosodic and phonotactic features for spoken language identification

Cited by: 0
Authors
Tong, Rong [1]
Ma, Bin [1]
Zhu, Donglai [1]
Li, Haizhou [1]
Chng, Eng Siong [1]
Affiliations
[1] Inst Infocomm Res, Singapore, Singapore
Source
2006 IEEE International Conference on Acoustics, Speech and Signal Processing, Vols 1-13 | 2006
Keywords
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification codes
070206; 082403
Abstract
The fundamental issue in automatic language identification is finding effective discriminative cues for languages. This paper studies the fusion of five features at different levels of abstraction for language identification: spectrum, duration, pitch, n-gram phonotactic, and bag-of-sounds features. We build a system and report test results on the NIST 1996 and 2003 LRE datasets. The system was also built to participate in the NIST 2005 LRE. The experimental results show that different levels of information provide complementary language cues. The prosodic features are more effective for shorter utterances, while the phonotactic features work better for longer utterances. For the 12-language task, the system fusing all five features achieved 2.38% EER for 30-second speech segments on the NIST 1996 dataset.
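The abstract reports fusion results as an equal error rate (EER) but does not state the fusion rule. The sketch below is only an illustration of the general idea, not the authors' method: it assumes a weighted linear combination of per-feature detection scores, and it approximates EER by sweeping thresholds over the observed scores. The function names, weights, and all scores are made-up toy values.

```python
from typing import List, Sequence


def fuse_scores(per_feature_scores: Sequence[Sequence[float]],
                weights: Sequence[float]) -> List[float]:
    """Weighted linear score fusion (an assumed rule, not taken from the paper)."""
    total = sum(weights)
    n_trials = len(per_feature_scores[0])
    return [
        sum(w * scores[i] for w, scores in zip(weights, per_feature_scores)) / total
        for i in range(n_trials)
    ]


def equal_error_rate(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Approximate EER: try each observed score as the threshold and return
    the mean of the false-accept and false-reject rates where they are closest."""
    n_target = sum(labels)
    n_nontarget = len(labels) - n_target
    best_gap, eer = float("inf"), 1.0
    for threshold in sorted(set(scores)):
        # A trial is accepted when its score is at or above the threshold.
        false_accepts = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
        false_rejects = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
        far = false_accepts / n_nontarget
        frr = false_rejects / n_target
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer


# Toy detection trials: label 1 = target language, 0 = non-target (made-up data).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
acoustic = [0.9, 0.8, 0.4, 0.7, 0.5, 0.2, 0.3, 0.1]
phonotactic = [0.6, 0.9, 0.8, 0.3, 0.1, 0.4, 0.2, 0.5]

fused = fuse_scores([acoustic, phonotactic], weights=[1.0, 1.0])
eer_acoustic = equal_error_rate(acoustic, labels)
eer_phonotactic = equal_error_rate(phonotactic, labels)
eer_fused = equal_error_rate(fused, labels)
```

In this toy case each single system misranks a different trial, so averaging the two score lists separates targets from non-targets better than either system alone. That mirrors the complementarity the abstract describes, although real systems typically learn the fusion weights on held-out data.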
Pages: 205-208
Page count: 4
Related papers
50 in total
  • [41] Performance Evaluation of Deep Bottleneck Features for Spoken Language Identification
    Jiang, Bing
    Song, Yan
    Wei, Si
    Wang, Meng-Ge
    McLoughlin, Ian
    Dai, Li-Rong
    2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2014, : 143 - +
  • [42] Automatic Language Identification for Berber and Arabic Languages Using Prosodic Features
    Lounnas, Khlaed
    Demri, Lyes
    Teffahi, Hocine
    Falek, Leila
    PROCEEDINGS 2018 3RD INTERNATIONAL CONFERENCE ON ELECTRICAL SCIENCES AND TECHNOLOGIES IN MAGHREB (CISTEM), 2018, : 239 - 242
  • [43] Phonotactic Language Recognition Using MLP Features
    BenZeghiba, Mohamed Faouzi
    Gauvain, Jean-Luc
    Lamel, Lori
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 2039 - 2042
  • [44] Subspace-Based Representation and Learning for Phonotactic Spoken Language Recognition
    Lee, Hung-Shin
    Tsao, Yu
    Jeng, Shyh-Kang
    Wang, Hsin-Min
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, 28 : 3065 - 3079
  • [45] BAYESIAN PHONOTACTIC LANGUAGE MODEL FOR ACOUSTIC UNIT DISCOVERY
    Ondel, Lucas
    Burget, Lukas
    Cernocky, Jan
    Kesiraju, Santosh
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 5750 - 5754
  • [46] Spoken document summarization using acoustic, prosodic and semantic information
    Huang, CL
    Hsieh, CH
    Wu, CH
    2005 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), VOLS 1 AND 2, 2005, : 434 - 437
  • [47] Individual differences in acoustic-prosodic entrainment in spoken dialogue
    Weise, Andreas
    Levitan, Sarah Ita
    Hirschberg, Julia
    Levitan, Rivka
    SPEECH COMMUNICATION, 2019, 115 : 78 - 87
  • [48] FRAME-BASED PHONOTACTIC LANGUAGE IDENTIFICATION
    Han, Kyu
    Pelecanos, Jason
    2012 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2012), 2012, : 303 - 306
  • [49] Integrating Prosodic Features in Extractive Meeting Summarization
    Xie, Shasha
    Hakkani-Tuer, Dilek
    Favre, Benoit
    Liu, Yang
    2009 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION & UNDERSTANDING (ASRU 2009), 2009, : 387 - +
  • [50] Sparse Representation based Language Identification using Prosodic Features for Indian Languages
    Singh, Om Prakash
    Haris, B. C.
    Sinha, Rohit
    Chettri, Bhusan
    Pradhan, Abhishek
    2013 ANNUAL IEEE INDIA CONFERENCE (INDICON), 2013,