Investigation of Stacked Deep Neural Networks and Mixture Density Networks for Acoustic-to-Articulatory Inversion

Cited by: 0
Authors
Xie, Xurong [1,2]
Liu, Xunying [1,2]
Lee, Tan [1]
Wang, Lan [2]
Affiliations
[1] Chinese Univ Hong Kong, Hong Kong, Peoples R China
[2] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen, Peoples R China
Source
2018 11TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2018
Keywords
articulatory inversion; stacked; deep neural network; mixture density network; EMA;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Acoustic-to-articulatory inversion, which predicts articulatory movement from the acoustic signal, is useful for many applications such as talking heads, speech recognition, and education. DNN-based techniques have achieved state-of-the-art performance in this area. This paper investigates different stacked network architectures for acoustic-to-articulatory inversion. Two levels of DNNs or mixture density networks (MDNs) can be connected using different types of auxiliary features extracted from the first-level network, including bottleneck features, directly generated features, and articulatory features predicted via the MLPG algorithm. In the experiments, stacked systems using DNNs, time-delay DNNs (TDNNs), RNNs, and MDNs were evaluated on both the MNGU0 English EMA database and the AIMSL Chinese EMA database. On the default MNGU0 configuration using LSF acoustic features, the proposed stacked system using feed-forward MDNs with ellipsoid variance and MLPG-generated features achieved an RMSE of 0.718 mm, comparable to the BLSTM-based RNN and RNN-MDN systems, which have a slower and more difficult training stage.
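The abstract describes the stacked architecture only at a high level. The sketch below (written in PyTorch, not taken from the paper) illustrates one such configuration under stated assumptions: a first-level feed-forward DNN predicts articulatory (EMA) features, and these predictions are concatenated with the acoustic input as auxiliary features for a second-level network with a mixture density output head using per-dimension ("ellipsoid") variances. Layer sizes, feature dimensions, and the omission of delta features and MLPG smoothing are illustrative assumptions.

```python
# Minimal illustrative sketch (not the authors' code) of a two-level stacked
# acoustic-to-articulatory inversion model, assuming PyTorch.
import torch
import torch.nn as nn


class FeedForwardDNN(nn.Module):
    """Simple multi-layer perceptron used for both stacking levels."""
    def __init__(self, in_dim, hidden_dim, out_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, out_dim),
        )

    def forward(self, x):
        return self.net(x)


class MDNHead(nn.Module):
    """Predicts mixture weights, component means, and per-dimension variances."""
    def __init__(self, in_dim, out_dim, n_mix):
        super().__init__()
        self.out_dim, self.n_mix = out_dim, n_mix
        # One weight, one mean vector, and one variance vector per component.
        self.proj = nn.Linear(in_dim, n_mix * (1 + 2 * out_dim))

    def forward(self, h):
        p = self.proj(h)
        logit_w, mu, log_var = torch.split(
            p, [self.n_mix, self.n_mix * self.out_dim, self.n_mix * self.out_dim], dim=-1)
        w = torch.softmax(logit_w, dim=-1)                                  # mixture weights
        mu = mu.reshape(*mu.shape[:-1], self.n_mix, self.out_dim)           # component means
        var = log_var.reshape(*log_var.shape[:-1], self.n_mix, self.out_dim).exp()
        return w, mu, var


class StackedInversion(nn.Module):
    """Level-1 DNN predictions feed level 2 as auxiliary features."""
    def __init__(self, acoustic_dim=40, ema_dim=12, hidden_dim=256, n_mix=4):
        super().__init__()
        # Level 1: plain DNN producing articulatory predictions used as
        # auxiliary features (MLPG smoothing of the output is omitted here).
        self.level1 = FeedForwardDNN(acoustic_dim, hidden_dim, ema_dim)
        # Level 2: DNN trunk plus MDN head over [acoustics ; level-1 prediction].
        self.trunk = FeedForwardDNN(acoustic_dim + ema_dim, hidden_dim, hidden_dim)
        self.mdn = MDNHead(hidden_dim, ema_dim, n_mix)

    def forward(self, acoustics):
        aux = self.level1(acoustics)
        w, mu, var = self.mdn(self.trunk(torch.cat([acoustics, aux], dim=-1)))
        return aux, (w, mu, var)
```

In the paper's full pipeline the second-level MDN would be trained with the mixture negative log-likelihood and its output trajectory decoded with MLPG; those steps are left out of this sketch for brevity.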
Pages: 36-40
Number of pages: 5