BAYESIAN PHONOTACTIC LANGUAGE MODEL FOR ACOUSTIC UNIT DISCOVERY

Cited by: 0
Authors
Ondel, Lucas [1]
Burget, Lukas [1]
Cernocky, Jan [1]
Kesiraju, Santosh [2]
Affiliations
[1] Brno Univ Technol, Brno, Czech Republic
[2] Int Inst Informat Technol, Hyderabad, Andhra Pradesh, India
Source
2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2017
Keywords
Bayesian non-parametric; Variational Bayes; acoustic unit discovery
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
Recent work on Acoustic Unit Discovery (AUD) has led to the development of a non-parametric Bayesian phone-loop model in which the prior over the probability of the phone-like units is assumed to be sampled from a Dirichlet Process (DP). In this work, we propose to improve this model by incorporating a Hierarchical Pitman-Yor based bigram Language Model on top of the units' transitions. This new model makes use of phonotactic context information but assumes a fixed number of units. To remedy this limitation, we first train a DP phone-loop model to infer the number of units; the bigram phone-loop is then initialized from the DP phone-loop and trained until its parameters converge. Results show an absolute improvement of 1-2% on the Normalized Mutual Information (NMI) metric. Furthermore, we show that, combined with Multilingual Bottleneck (MBN) features, the model yields the same or higher NMI than an English phone recogniser trained on TIMIT.
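The NMI metric mentioned in the abstract can be illustrated with a minimal sketch. It assumes frame-aligned reference phone labels and discovered unit labels, and it assumes the mutual information is normalized by the entropy of the reference labels; the abstract does not state which normalization the authors use, and other conventions exist. The function name `normalized_mutual_information` is hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def normalized_mutual_information(reference, discovered):
    """Return I(X;Y) / H(X) for two aligned label sequences.

    X is the reference labelling, Y the discovered labelling.
    Normalizing by H(X) is an assumption made for this sketch.
    """
    if len(reference) != len(discovered):
        raise ValueError("label sequences must have the same length")
    n = len(reference)
    if n == 0:
        raise ValueError("label sequences must not be empty")

    ref_counts = Counter(reference)
    dis_counts = Counter(discovered)
    joint_counts = Counter(zip(reference, discovered))

    # Entropy of the reference labels, H(X), in nats.
    h_ref = -sum((c / n) * math.log(c / n) for c in ref_counts.values())
    if h_ref == 0.0:
        # A single reference label carries no information to recover.
        return 0.0

    # Mutual information I(X;Y) from empirical joint and marginal counts:
    # p(x,y) / (p(x) p(y)) = c * n / (count(x) * count(y)).
    mi = sum(
        (c / n) * math.log(c * n / (ref_counts[x] * dis_counts[y]))
        for (x, y), c in joint_counts.items()
    )
    return mi / h_ref


# Discovered units that match the reference up to renaming give NMI = 1.
perfect = normalized_mutual_information(["a", "a", "b", "b"], [1, 1, 2, 2])

# Discovered units that are independent of the reference give NMI = 0.
independent = normalized_mutual_information(["a", "a", "b", "b"], [1, 2, 1, 2])
```

Under this normalization the score is invariant to how the discovered units are named, which suits unsupervised unit discovery, where the unit inventory has no predefined correspondence to phone labels.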
Pages: 5750 - 5754
Number of pages: 5
Related papers
14 in total
  • [1] [Anonymous], 2010, BAYESIAN NONPARAMETR
  • [2] [Anonymous], 2011, WORKSH AUT SPEECH RE
  • [3] MIXTURES OF DIRICHLET PROCESSES WITH APPLICATIONS TO BAYESIAN NONPARAMETRIC PROBLEMS
    ANTONIAK, CE
    [J]. ANNALS OF STATISTICS, 1974, 2 (06) : 1152 - 1174
  • [4] Variational Inference for Dirichlet Process Mixtures
    Blei, David M.
    Jordan, Michael I.
    [J]. BAYESIAN ANALYSIS, 2006, 1 (01) : 121 - 143
  • [5] Garofolo J.S., 1993, LINGUIST DATA CONSOR, DOI 10.35111/17GK-BN40
  • [6] Grezl F., 2014, P 4 INT WORKSH SPOK, P39
  • [7] Johnson Mark, 2006, Advances in Neural Information Processing Systems, V19, P641
  • [8] Lee C.-Y., 2012, P 50 ANN M ASS COMPU, P40
  • [9] Novotney S, 2009, INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, P236
  • [10] Variational Inference for Acoustic Unit Discovery
    Ondel, Lucas
    Burget, Lukas
    Cernocky, Jan
    [J]. SLTU-2016 5TH WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGIES FOR UNDER-RESOURCED LANGUAGES, 2016, 81 : 80 - 86