Ensemble Speaker Modeling using Speaker Adaptive Training Deep Neural Network for Speaker Adaptation

Cited by: 0
Authors
Li, Sheng [1]
Lu, Xugang [2]
Akita, Yuya [1]
Kawahara, Tatsuya [1]
Affiliations
[1] Kyoto Univ, Sch Informat, Sakyo Ku, Kyoto 6068501, Japan
[2] Natl Inst Informat & Commun Technol, Kyoto, Japan
Source
16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5 | 2015
Keywords
speaker adaptation; deep neural networks; ensemble modeling; lecture transcription;
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
In this paper, we introduce an ensemble speaker modeling approach using a speaker adaptive training (SAT) deep neural network (SAT-DNN). We first train a speaker-independent DNN (SI-DNN) acoustic model as a universal speaker model (USM). Based on the USM, a SAT-DNN is used to obtain a set of speaker-dependent models under the assumption that all layers except one speaker-dependent (SD) layer are shared among speakers. The speaker ensemble matrix is created by concatenating all of the SD-layer weight matrices, and an ensemble speaker subspace is extracted from it with a matrix factorization technique. At test time, an initial model for each target speaker is selected in this ensemble speaker subspace, and adaptation is then carried out to obtain the final acoustic model. To reduce the number of adaptation parameters, a low-rank speaker subspace is further explored. We test our algorithm on a lecture transcription task. Experimental results show that the proposed method is effective for unsupervised speaker adaptation.
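The subspace step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say which matrix factorization is used, so a truncated SVD of the mean-centered ensemble matrix is assumed here, and all function names, shapes, and the synthetic data are hypothetical.

```python
import numpy as np


def build_ensemble_matrix(sd_weights):
    """Flatten each speaker's SD-layer weight matrix and stack them as columns."""
    return np.stack([w.ravel() for w in sd_weights], axis=1)


def extract_subspace(ensemble, rank):
    """Assumed factorization: truncated SVD of the mean-centered ensemble.

    Returns the mean weight vector and an orthonormal basis of `rank` columns.
    """
    mean = ensemble.mean(axis=1, keepdims=True)
    u, _, _ = np.linalg.svd(ensemble - mean, full_matrices=False)
    return mean[:, 0], u[:, :rank]


def project(weight, mean, basis):
    """Express one speaker's SD weights by a few subspace coordinates."""
    coords = basis.T @ (weight.ravel() - mean)
    recon = (mean + basis @ coords).reshape(weight.shape)
    return coords, recon


# Synthetic stand-in for trained SD layers: 20 speakers whose 8x6 weight
# matrices differ from a shared matrix only along 3 directions.
rng = np.random.default_rng(0)
n_speakers, shape, true_rank = 20, (8, 6), 3
base = rng.standard_normal(shape)
directions = rng.standard_normal((true_rank,) + shape)
sd_weights = [
    base + np.tensordot(rng.standard_normal(true_rank), directions, axes=1)
    for _ in range(n_speakers)
]

ensemble = build_ensemble_matrix(sd_weights)             # shape (48, 20)
mean, basis = extract_subspace(ensemble, rank=true_rank)
coords, recon = project(sd_weights[0], mean, basis)      # 3 numbers instead of 48
```

In this toy setting a speaker is described by `true_rank` coordinates rather than a full weight matrix, which is the sense in which a low-rank subspace reduces the number of adaptation parameters. How the initial model is selected in the subspace and how adaptation proceeds are not specified in the abstract and are not modeled here.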
Pages: 2892-2896
Number of pages: 5