Incorporation of discriminative n-grams to improve a phonotactic language recognizer based on i-vectors

Cited by: 0
Authors
Salamea Palacios, Christian [1, 2]
Fernando D'Haro, Luis [1]
Cordoba, Ricardo [1]
Angel Caraballo, Miguel [1]
Affiliations
[1] Univ Politecn Madrid, ETSI Telecomunicac, Dept Ingn Elect, Grp Tecnol Habla, Ciudad Univ S-N, E-28040 Madrid, Spain
[2] Univ Politecn Salesiana Ecuador, Cuenca, Ecuador
Source
PROCESAMIENTO DEL LENGUAJE NATURAL | 2013 / Issue 51
Keywords
Posteriorgram; i-Vectors; discriminative rankings; phonotactic; n-grams
DOI
Not available
Chinese Library Classification number
H0 [Linguistics]
Discipline classification codes
030303; 0501; 050102
Abstract
This paper describes a novel technique that combines the information from two different phonotactic systems to improve the results of an automatic language recognition system. The first system builds posteriorgram counts that are used to generate i-vectors. The second system is a variation of the first that takes into account the most discriminative n-grams, selected as a function of their occurrence in one language compared to all other languages. The proposed technique yields a relative improvement of 8.63% in C-avg on the official set used for the ALBAYZIN 2012 LRE evaluation.
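The idea of ranking n-grams by how much more often they occur in one language than in all the others can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not give the scoring formula, so the score used here (relative frequency in the target language minus the mean relative frequency in the other languages) is an assumption, as are the function names and the toy data. The sketch also uses hard n-gram counts from phone sequences, whereas the paper works with posteriorgram (soft) counts.

```python
from collections import Counter


def ngrams(phones, n):
    """Return the n-grams (as tuples) found in one phone sequence."""
    return [tuple(phones[i:i + n]) for i in range(len(phones) - n + 1)]


def relative_freqs(sequences, n):
    """Relative frequency of each n-gram over all sequences of one language."""
    counts = Counter(g for seq in sequences for g in ngrams(seq, n))
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()} if total else {}


def rank_discriminative(data, n=2, top_k=3):
    """Rank the n-grams of each language by an assumed discriminative score.

    Score (an assumption, not taken from the paper):
    frequency in the language minus mean frequency in the other languages.
    """
    freqs = {lang: relative_freqs(seqs, n) for lang, seqs in data.items()}
    ranking = {}
    for lang, own in freqs.items():
        others = [f for other, f in freqs.items() if other != lang]
        scores = {}
        for gram, p in own.items():
            rest = sum(f.get(gram, 0.0) for f in others) / len(others) if others else 0.0
            scores[gram] = p - rest
        # Highest score first; ties broken by the n-gram itself for determinism.
        ranking[lang] = sorted(scores, key=lambda g: (-scores[g], g))[:top_k]
    return ranking


# Hypothetical toy data: two languages, each with two decoded phone sequences.
toy = {
    "lang_a": [["a", "b", "a", "b"], ["a", "b", "c"]],
    "lang_b": [["c", "a", "c", "a"], ["b", "c", "a"]],
}
top = rank_discriminative(toy, n=2, top_k=1)
```

In this toy setup the bigram shared by both languages scores zero and drops out of the top of each ranking, which is the behaviour a discriminative selection is meant to produce.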
Pages: 145-152
Number of pages: 8