Unsupervised Adaptation for Deep Neural Network using Linear Least Square Method

Citations: 0
Authors
Hsiao, Roger [1]
Ng, Tim [1]
Tsakalidis, Stavros [1]
Nguyen, Long [1]
Schwartz, Richard [1]
Affiliation
[1] Raytheon BBN Technol, 10 Moulton St, Cambridge, MA 02138 USA
Source
16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5 | 2015
Keywords
deep neural network; unsupervised adaptation; keyword search; speaker adaptation
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
In this paper, we propose a novel model-based adaptation method for deep neural networks based on a linear least squares method. Our proposed algorithm can perform unsupervised adaptation even when the automatic transcripts have a word error rate of 60-70%. We evaluate our algorithm on low-resource languages from the IARPA BABEL program, such as Assamese, Bengali, Haitian Creole, Lao and Zulu. Our experiments focus on unsupervised speaker, dialect and environment adaptation, and we show that the algorithm can improve both speech recognition and keyword search performance.
Pages: 2887-2891
Number of pages: 5