SEMI-SUPERVISED TRAINING OF ACOUSTIC MODELS USING LATTICE-FREE MMI

Cited by: 0

Authors:

Manohar, Vimal [1,2]

Hadian, Hossein [1]

Povey, Daniel [1,2]

Khudanpur, Sanjeev [1,2]

Affiliations:

[1] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21218 USA

[2] Johns Hopkins Univ, Human Language Technol Ctr Excellence, Baltimore, MD 21218 USA

Source:

2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2018

Funding:

National Science Foundation (USA)

Keywords:

Semi-supervised training; Lattice-free MMI; Sequence training; Automatic speech recognition; Speech recognition

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Subject classification codes:

070206; 082403

Abstract:

The lattice-free MMI objective (LF-MMI) has been used in supervised training of state-of-the-art neural network acoustic models for automatic speech recognition (ASR). With large amounts of unsupervised data available, extending this approach to the semi-supervised scenario is of significance. Finite-state transducer (FST) based supervision used with LF-MMI provides a natural way to incorporate uncertainties when dealing with unsupervised data. In this paper, we describe various extensions to standard LF-MMI training to allow the use as supervision of lattices obtained via decoding of unsupervised data. The lattices are rescored with a strong LM. We investigate different methods for splitting the lattices and incorporating frame tolerances into the supervision FST. We report results on different subsets of Fisher English, where we achieve WER recovery of 59-64% using lattice supervision, which is significantly better than using just the best path transcription.
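
The abstract reports WER recovery without defining it. A minimal sketch follows, assuming the common definition: the fraction of the gap between a supervised-only baseline and an oracle system (trained with true transcripts for the unsupervised data) that semi-supervised training closes. The function name `wer_recovery` and all numbers are hypothetical illustrations, not results from the paper.

```python
def wer_recovery(baseline_wer: float, semisup_wer: float, oracle_wer: float) -> float:
    """Fraction of the baseline-to-oracle WER gap closed by semi-supervised training.

    Assumed definition (not stated in the abstract):
        (baseline_wer - semisup_wer) / (baseline_wer - oracle_wer)
    """
    gap = baseline_wer - oracle_wer
    if gap <= 0:
        raise ValueError("oracle WER must be lower than baseline WER")
    return (baseline_wer - semisup_wer) / gap


# Hypothetical WERs in percent, chosen only to illustrate the arithmetic.
recovery = wer_recovery(baseline_wer=20.0, semisup_wer=17.0, oracle_wer=15.0)
print(f"WER recovery: {recovery:.0%}")
```

Under this assumed definition, a value of 0% means the unsupervised data gave no gain over the baseline, and 100% means it helped as much as fully transcribed data would have.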


Pages: 4844-4848

Page count: 5
