COMPARING RNNS AND LOG-LINEAR INTERPOLATION OF IMPROVED SKIP-MODEL ON FOUR BABEL LANGUAGES: CANTONESE, PASHTO, TAGALOG, TURKISH

Times cited: 0
Authors
Singh, Mittul [1]
Klakow, Dietrich [1]
Affiliations
[1] Univ Saarland, D-66123 Saarbrucken, Germany
Source
2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2013
Keywords
RNNs; log-linear interpolation; skip models; smoothing; under-researched languages
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Recurrent neural networks (RNNs) are a very recent technique for modeling long-range dependencies in natural language. They have clearly outperformed trigrams and other more advanced language modeling techniques by modeling long-range dependencies non-linearly. An alternative is log-linear interpolation of skip models (i.e., skip bigrams and skip trigrams). The method itself has been published earlier. In this paper we investigate the impact of different smoothing techniques on the skip models, measured by their overall performance. One option is to use automatically trained distance clusters (both hard and soft) to increase robustness and to combat sparseness in the skip model. We also investigate alternative smoothing techniques at the word level. For skip bigrams, Kneser-Ney (KN) smoothing is advantageous when only a small number of words is skipped, whereas Dirichlet smoothing performs better when a larger number of words is skipped. To exploit the advantages of both KN and Dirichlet smoothing, we propose a new unified smoothing technique. Experiments are performed on four Babel languages: Cantonese, Pashto, Tagalog and Turkish. RNNs and log-linearly interpolated skip models are on par if the skip models are trained with standard smoothing techniques. Using the improved smoothing of the skip models along with distance clusters, we clearly outperform RNNs by about 8-11% in perplexity across all four languages.
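To make the terms in the abstract concrete, the following is a minimal sketch of Dirichlet-smoothed skip bigrams combined by log-linear interpolation. It is not the authors' implementation: the function names, the toy corpus, the weights and the value of mu are all hypothetical, chosen only for illustration, and the paper's distance clusters and unified KN/Dirichlet smoothing are not reproduced here.

```python
import math
from collections import Counter, defaultdict


def skip_bigram_counts(tokens, distance):
    """Count (history, word) pairs where the history lies `distance` positions back.

    distance=1 is an ordinary bigram; distance=2 skips one word, and so on.
    """
    pair_counts = defaultdict(Counter)
    for i in range(distance, len(tokens)):
        pair_counts[tokens[i - distance]][tokens[i]] += 1
    return pair_counts


def dirichlet_prob(word, history, pair_counts, unigram, total, mu):
    """Dirichlet-smoothed estimate: (c(h, w) + mu * P(w)) / (c(h) + mu)."""
    hist_counts = pair_counts.get(history, Counter())
    p_unigram = unigram[word] / total
    return (hist_counts[word] + mu * p_unigram) / (sum(hist_counts.values()) + mu)


def log_linear_interpolate(context, vocab, models, weights, unigram, total, mu):
    """Combine skip models as P(w | context) proportional to prod_d P_d(w | h_d)^lambda_d.

    `models` maps a distance d to its skip-bigram counts. The product is
    renormalised over `vocab` so that the result sums to one.
    """
    scores = {}
    for w in vocab:
        log_score = 0.0
        for d, lam in weights.items():
            if d > len(context):
                continue  # context too short for this skip distance
            p = dirichlet_prob(w, context[-d], models[d], unigram, total, mu)
            log_score += lam * math.log(p)
        scores[w] = math.exp(log_score)
    z = sum(scores.values())
    return {w: s / z for w, s in scores.items()}


# Toy usage (hypothetical corpus and illustrative, untuned parameters).
tokens = "a b a c a b".split()
unigram = Counter(tokens)
total = len(tokens)
vocab = sorted(unigram)
models = {d: skip_bigram_counts(tokens, d) for d in (1, 2)}
weights = {1: 0.7, 2: 0.3}  # assumed interpolation weights
dist = log_linear_interpolate(["b", "a"], vocab, models, weights, unigram, total, mu=1.0)
```

In practice the interpolation weights and mu would be tuned on held-out data, and the normalisation over the full vocabulary is the main cost of log-linear interpolation compared with a linear mixture.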
Pages: 8416-8420
Number of pages: 5