ATTRIBUTE BASED SHARED HIDDEN LAYERS FOR CROSS-LANGUAGE KNOWLEDGE TRANSFER

Cited by: 0

Authors:

Arora, Vipul [1]

Lahiri, Aditi [1]

Reetz, Henning [2]

Affiliations:

[1] Univ Oxford, Fac Linguist Philol & Phonet, Oxford, England

[2] Goethe Univ, Frankfurt, Germany

Source:

2016 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2016) | 2016

Funding:

European Research Council;

Keywords:

Deep neural networks adaptation; knowledge transfer; cross-lingual ASR; phonological features; zero-shot learning; NEURAL-NETWORK; SPEECH; RECOGNITION; ADAPTATION;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Deep neural network (DNN) acoustic models can be adapted to under-resourced languages by transferring the hidden layers. An analogous transfer problem is well known as few-shot learning, in which scantily seen objects are recognised based on their meaningful attributes. In a similar way, this paper proposes a principled way to represent the hidden layers of a DNN in terms of attributes shared across languages. The diverse phoneme sets of different languages can be represented in terms of phonological features that they share. The DNN layers estimating these features can then be transferred in a meaningful and reliable way. Here, we evaluate model transfer from English to German by comparing the proposed method with other popular methods on the task of phoneme recognition. Experimental results indicate that, apart from making DNN acoustic models more interpretable, the proposed framework provides an efficient means for their speedy adaptation to different languages, even when adaptation data are scanty.
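
The idea of describing different phoneme sets through shared phonological features can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the feature names, the toy phoneme inventories, the probability values, and the simple product scoring rule are hypothetical and are not taken from the paper, which uses DNN layers rather than this hand-written rule.

```python
# Hypothetical toy illustration; NOT the paper's feature set, inventories, or model.

# Assumed inventory of binary phonological features shared by both languages.
FEATURES = ["voiced", "nasal", "labial", "coronal", "high", "back", "round"]

# Assumed toy "English" inventory: each phoneme is the set of features it carries.
ENGLISH = {
    "p": {"labial"},
    "b": {"voiced", "labial"},
    "m": {"voiced", "nasal", "labial"},
    "t": {"coronal"},
    "d": {"voiced", "coronal"},
    "n": {"voiced", "nasal", "coronal"},
    "i": {"voiced", "high"},
    "u": {"voiced", "high", "back", "round"},
}

# Assumed toy "German" inventory: the same phonemes plus the front rounded
# vowel "y", which is new as a phoneme but uses only the shared features.
GERMAN = dict(ENGLISH, y={"voiced", "high", "round"})


def phoneme_score(feature_probs, phoneme_features):
    """Score a phoneme as the product of per-feature agreement probabilities."""
    score = 1.0
    for name in FEATURES:
        p = feature_probs[name]
        # Use p if the phoneme carries the feature, otherwise 1 - p.
        score *= p if name in phoneme_features else 1.0 - p
    return score


def best_phoneme(feature_probs, inventory):
    """Return the phoneme in the inventory with the highest score."""
    return max(inventory, key=lambda ph: phoneme_score(feature_probs, inventory[ph]))


# Made-up feature probabilities for one frame, standing in for the output of a
# shared feature-estimating layer: a voiced, high, rounded, probably front sound.
frame = {
    "voiced": 0.9, "nasal": 0.1, "labial": 0.1, "coronal": 0.1,
    "high": 0.9, "back": 0.2, "round": 0.9,
}

english_guess = best_phoneme(frame, ENGLISH)
german_guess = best_phoneme(frame, GERMAN)
print(english_guess, german_guess)
```

In this sketch the feature probabilities are language independent, and only the small phoneme-to-feature table changes between languages. With the toy values above, the English table selects "u" while the German table selects "y" from the same feature estimates.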

Citation

Pages: 617-623

Page count: 7
