Exploiting foreign resources for DNN-based ASR

Cited by: 9

Authors:

Motlicek, Petr [1]

Imseng, David [1]

Potard, Blaise [1]

Garner, Philip N. [1]

Himawan, Ivan [1]

Affiliations:

[1] Idiap Res Inst, Martigny, Switzerland

Source:

EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING | 2015

Keywords:

Automatic speech recognition; Deep learning for speech; Acoustic model adaptation; Semi-supervised training; SPEECH; ALGORITHM; FEATURES;

DOI:

10.1186/s13636-015-0058-5

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

Manual transcription of audio databases for the development of automatic speech recognition (ASR) systems is a costly and time-consuming process. In the context of deriving acoustic models adapted to a specific application, or in low-resource scenarios, it is therefore essential to explore alternatives capable of improving speech recognition results. In this paper, we investigate the relevance of foreign data characteristics, in particular domain and language, when this data is used as an auxiliary source for training ASR acoustic models based on deep neural networks (DNNs). The acoustic models are evaluated on a challenging bilingual database within the scope of the MediaParl project. Experimental results suggest that in-language (but out-of-domain) data is more beneficial than in-domain (but out-of-language) data when employed in either supervised or semi-supervised training of DNNs. The best-performing ASR system, an HMM/GMM acoustic model that exploits a DNN as a discriminatively trained feature extractor, outperforms the best-performing HMM/DNN hybrid by about 5% relative (in terms of WER). The accumulated relative WER gain with respect to the MFCC-HMM/GMM baseline is about 30%.

Pages: 1 / 10

Page count: 10
