Incorporation of discriminative n-grams to improve a phonotactic language recognizer based on i-vectors

Cited by: 0

Authors:

Salamea Palaciosi, Christian ^{[1,2]}

Fernando D'Haro, Luis ^{[1]}

Cordoba, Ricardo ^{[1]}

Angel Caraballo, Miguel ^{[1]}

Affiliations:

[1] Univ Politecn Madrid, ETSI Telecomunicac, Dept Ingn Elect, Grp Tecnol Habla, Ciudad Univ S-N, E-28040 Madrid, Spain

[2] Univ Politecn Salesiana Ecuador, Cuenca, Ecuador

Source:

PROCESAMIENTO DEL LENGUAJE NATURAL | 2013 / Issue 51

Keywords:

Posteriorgram; i-Vectors; discriminative rankings; phonotactic; n-grams

DOI:

Not available

Chinese Library Classification (CLC) number:

H0 [Linguistics];

Subject classification codes:

030303 ; 0501 ; 050102 ;

Abstract:

This paper describes a novel technique that combines information from two different phonotactic systems in order to improve the results of an automatic language recognition system. The first system builds posteriorgram counts that are used to generate i-vectors. The second system is a variation of the first that takes into account the most discriminative n-grams, determined by their occurrence in one language compared with all other languages. The proposed technique yields a relative improvement of 8.63% in C-avg on the official set used for the ALBAYZIN 2012 LRE evaluation.
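The idea of ranking n-grams by how much more often they occur in one language than in the others can be illustrated with a minimal sketch. This is not the authors' method: the record does not give their ranking formula, and the paper works with posteriorgram (soft) counts, whereas the sketch below assumes hard counts taken from decoded phone strings. The function names, the scoring rule (relative frequency in the target language minus the mean relative frequency in the other languages), and the toy language labels and phone sequences are all hypothetical.

```python
from collections import Counter


def ngram_counts(phones, n):
    """Count the n-grams (as tuples) in one decoded phone sequence."""
    return Counter(tuple(phones[i:i + n]) for i in range(len(phones) - n + 1))


def relative_freqs(counts):
    """Normalize raw counts to relative frequencies."""
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()} if total else {}


def discriminative_ranking(counts_by_lang, target, top_k=None):
    """Rank the target language's n-grams, most discriminative first.

    Assumed score (not taken from the paper): relative frequency in the
    target language minus the mean relative frequency in the others.
    """
    freqs = {lang: relative_freqs(c) for lang, c in counts_by_lang.items()}
    others = [lang for lang in freqs if lang != target]
    if not others:
        raise ValueError("need at least one language besides the target")
    scores = {}
    for gram, f_target in freqs[target].items():
        f_others = sum(freqs[lang].get(gram, 0.0) for lang in others) / len(others)
        scores[gram] = f_target - f_others
    # Sort by descending score; break ties on the n-gram itself.
    ranked = sorted(scores, key=lambda g: (-scores[g], g))
    return ranked if top_k is None else ranked[:top_k]


# Toy data: hypothetical phone strings for two hypothetical languages.
counts_by_lang = {
    "es": ngram_counts("a b a b a k".split(), 2),
    "en": ngram_counts("a k a k a b".split(), 2),
}
top_es = discriminative_ranking(counts_by_lang, "es", top_k=1)
```

In this toy example the bigram `("b", "a")` ranks first for `"es"` because it never occurs in the `"en"` sequence. A real system would accumulate counts over many utterances per language and would likely use a different discriminative measure.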


Pages: 145-152

Page count: 8
