Automatic Pronunciation Generation by Utilizing a Semi-supervised Deep Neural Networks

Cited by: 0

Authors:

Takahashi, Naoya [1]

Naghibi, Tofigh [2]

Pfister, Beat [2]

Affiliations:

[1] Sony Corp, Tokyo, Japan

[2] Swiss Fed Inst Technol, Speech Proc Grp, Zurich, Switzerland

Source:

17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES | 2016

Keywords:

speech recognition; deep neural networks; semi-supervised learning; dictionary; sub-word unit; k-dimensional Viterbi

DOI:

10.21437/Interspeech.2016-761

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

Phonemic or phonetic sub-word units are the most commonly used atomic elements for representing speech signals in modern ASR systems. However, they are not the optimal choice for several reasons: the large effort required to handcraft a pronunciation dictionary, pronunciation variations, human mistakes, and under-resourced dialects and languages. Here, we propose a data-driven pronunciation estimation and acoustic modeling method that takes only the orthographic transcription to jointly estimate a set of sub-word units and a reliable dictionary. Experimental results show that the proposed method, which is based on semi-supervised training of a deep neural network, largely outperforms phoneme-based continuous speech recognition on the TIMIT dataset.


Pages: 1141-1145

Number of pages: 5
