Investigation of Stacked Deep Neural Networks and Mixture Density Networks for Acoustic-to-Articulatory Inversion

Cited by: 0

Authors:

Xie, Xurong [1,2]

Liu, Xunying [1,2]

Lee, Tan [1]

Wang, Lan [2]

Affiliations:

[1] Chinese Univ Hong Kong, Hong Kong, Peoples R China

[2] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen, Peoples R China

Source:

2018 11TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP) | 2018

Keywords:

articulatory inversion; stacked; deep neural network; mixture density network; EMA;

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Acoustic-to-articulatory inversion, which predicts articulatory movement from the acoustic signal, is useful for many applications such as talking heads, speech recognition, and education. DNN-based techniques have achieved state-of-the-art performance in this area. This paper investigates different stacked network architectures for acoustic-to-articulatory inversion. Two levels of DNNs or mixture density networks (MDNs) can be connected using different types of auxiliary features extracted from the first-level network, including bottleneck features, directly generated features, and articulatory features predicted via the MLPG algorithm. In the experiments, stacked systems using DNNs, time-delay DNNs (TDNNs), RNNs and MDNs were evaluated on both the MNGU0 English EMA database and the AIMSL Chinese EMA database. On the default configuration of the MNGU0 data using LSF acoustic features, the proposed stacked system using feed-forward MDNs with ellipsoid variance and MLPG-generated features achieved an RMSE of 0.718 mm, which is similar to that of the RNN and RNN-MDN BLSTM systems, whose training stage is slower and more difficult.


Pages: 36-40

Page count: 5
