Investigation of Stacked Deep Neural Networks and Mixture Density Networks for Acoustic-to-Articulatory Inversion

Cited by: 0
Authors
Xie, Xurong [1, 2]
Liu, Xunying [1, 2]
Lee, Tan [1]
Wang, Lan [2]
Affiliations
[1] Chinese Univ Hong Kong, Hong Kong, Peoples R China
[2] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen, Peoples R China
Source
2018 11TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP) | 2018
Keywords
articulatory inversion; stacked; deep neural network; mixture density network; EMA;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Acoustic-to-articulatory inversion, which predicts articulatory movement from the acoustic signal, is useful for many applications such as talking heads, speech recognition, and education. DNN-based techniques have achieved state-of-the-art performance in this area. This paper investigates different stacked network architectures for acoustic-to-articulatory inversion. Two levels of DNNs or mixture density networks (MDNs) can be connected using different types of auxiliary features extracted from the first-level network, including bottleneck features, directly generated features, and articulatory features predicted via the MLPG algorithm. In the experiments, stacked systems using DNNs, time-delay DNNs (TDNNs), RNNs, and MDNs were evaluated on both the MNGU0 English EMA database and the AIMSL Chinese EMA database. On the default configuration of the MNGU0 data with LSF acoustic features, the proposed stacked system using feed-forward MDNs with ellipsoid variance and MLPG-generated features achieved an RMSE of 0.718 mm, which is similar to that of the RNN and RNN-MDN BLSTM systems, whose training is slower and more difficult.
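The stacking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `stack_inputs` and `rmse`, the toy values, and the choice to average the squared error over all frames and channels are assumptions made for illustration only. The sketch shows how per-frame auxiliary features from a first-level network might be appended to the acoustic features before being passed to a second-level network, and how an RMSE between predicted and reference articulator positions could be computed.

```python
import math


def stack_inputs(acoustic_frames, auxiliary_frames):
    """Append first-level auxiliary features to each acoustic frame.

    Both arguments are lists of per-frame feature lists (hypothetical layout).
    """
    if len(acoustic_frames) != len(auxiliary_frames):
        raise ValueError("acoustic and auxiliary frame counts differ")
    return [list(a) + list(x) for a, x in zip(acoustic_frames, auxiliary_frames)]


def rmse(predicted, reference):
    """Root-mean-square error pooled over all frames and channels.

    Assumption: the paper may aggregate errors differently (e.g. per channel).
    """
    squared = [
        (p - r) ** 2
        for p_frame, r_frame in zip(predicted, reference)
        for p, r in zip(p_frame, r_frame)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Toy example: two frames, two acoustic dimensions, one auxiliary dimension.
acoustic = [[0.1, 0.2], [0.3, 0.4]]
auxiliary = [[1.0], [2.0]]  # stand-in for bottleneck or MLPG-generated features
second_level_input = stack_inputs(acoustic, auxiliary)
```

Here `second_level_input` has three values per frame, which is what a second-level network would consume in this simplified picture.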
Pages: 36-40
Number of pages: 5