Significance of neural phonotactic models for large-scale spoken language identification

Authors
Srivastava, Brij Mohan Lal [1]
Vydana, Hari [1]
Vuppala, Anil Kumar [1]
Shrivastava, Manish [1]
Affiliations
[1] Int Inst Informat Technol, Language Technol Res Ctr, Hyderabad, Andhra Pradesh, India
Source
2017 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) | 2017
Keywords
RECOGNITION;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Language identification (LID) is a vital front end for spoken dialogue systems operating in diverse linguistic settings, since it reduces recognition and understanding errors. Existing LID systems that use low-level signal information for classification do not scale well, owing to exponential growth in parameters as the number of classes increases. They also suffer performance degradation due to the inherent variability of the speech signal. In the proposed approach, we model the language-specific phonotactic information in speech using a recurrent neural network to develop an LID system. The input speech signal is tokenized into phone sequences by a common language-independent phone recognizer with varying phonetic coverage. We establish a causal relationship between phonetic coverage and LID performance. The phonotactics of the observed phone sequences are modeled using statistical and recurrent neural network language models to predict language-specific symbols from a universal phonetic inventory. The proposed approach is robust, computationally lightweight, and highly scalable. Experiments show that a convex combination of statistical and recurrent neural network language model (RNNLM) based phonotactic models significantly outperforms a strong deep neural network (DNN) baseline, which is itself shown to surpass the i-vector based approach for LID. The proposed approach outperforms the baseline models in terms of mean F1 score over 176 languages. Further, we provide significant information-theoretic evidence to analyze the mechanism of the proposed approach.
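The convex combination mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one phonotactic model per language, uses an add-alpha smoothed bigram model as the statistical component, and substitutes a uniform distribution for the RNNLM so that the example stays self-contained. All names (`train_bigram`, `uniform_model`, `identify`, `lang_x`, `lang_y`) and the toy phone sequences are hypothetical.

```python
import math
from collections import Counter

START = "<s>"  # sentence-start marker (assumption, not from the paper)


def train_bigram(sequences, vocab, alpha=1.0):
    """Return a function giving add-alpha smoothed P(next | prev)."""
    pair_counts = Counter()
    prev_counts = Counter()
    for seq in sequences:
        padded = [START] + list(seq)
        for prev, nxt in zip(padded, padded[1:]):
            pair_counts[(prev, nxt)] += 1
            prev_counts[prev] += 1
    size = len(vocab)

    def prob(prev, nxt):
        return (pair_counts[(prev, nxt)] + alpha) / (prev_counts[prev] + alpha * size)

    return prob


def uniform_model(vocab):
    """Stand-in for a second model such as an RNNLM (assumption)."""
    p = 1.0 / len(vocab)
    return lambda prev, nxt: p


def interpolated_log_likelihood(seq, model_a, model_b, lam):
    """Score a phone sequence with lam * P_a + (1 - lam) * P_b per symbol."""
    padded = [START] + list(seq)
    total = 0.0
    for prev, nxt in zip(padded, padded[1:]):
        p = lam * model_a(prev, nxt) + (1.0 - lam) * model_b(prev, nxt)
        total += math.log(p)
    return total


def identify(seq, models, lam=0.5):
    """Pick the language whose interpolated model scores the sequence highest."""
    return max(
        models,
        key=lambda lang: interpolated_log_likelihood(seq, *models[lang], lam),
    )


# Toy phone inventory and training sequences (invented for illustration).
vocab = ["a", "b", "c"]
models = {
    "lang_x": (train_bigram([["a", "b", "a", "b"], ["a", "b", "a"]], vocab),
               uniform_model(vocab)),
    "lang_y": (train_bigram([["c", "c", "a"], ["c", "a", "c"]], vocab),
               uniform_model(vocab)),
}
```

Calling `identify(["a", "b", "a", "b"], models)` scores the sequence under each language's interpolated model and returns the best-scoring language label. The weight `lam` must lie in [0, 1] for the mixture to remain a valid probability distribution; in practice it would be tuned on held-out data.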
Pages: 2144 - 2151
Page count: 8