Open Source German Distant Speech Recognition: Corpus and Acoustic Model

Cited by: 17

Authors:

Radeck-Arneth, Stephan [1,2]

Milde, Benjamin [1]

Lange, Arvid [1,2]

Gouvea, Evandro

Radomski, Stefan [1]

Muehlhaeuser, Max [1]

Biemann, Chris [1]

Affiliations:

[1] Tech Univ Darmstadt, Dept Comp Sci, Language Technol Grp, Darmstadt, Germany

[2] Tech Univ Darmstadt, Dept Comp Sci, Telecooperat Grp, Darmstadt, Germany

Source:

TEXT, SPEECH, AND DIALOGUE (TSD 2015) | 2015 / Vol. 9302

Keywords:

German speech recognition; Open source; Speech corpus; Distant speech recognition; Speaker-independent

DOI:

10.1007/978-3-319-24033-6_54

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

We present a new freely available corpus for German distant speech recognition and report speaker-independent word error rate (WER) results for two open source speech recognizers trained on this corpus. The corpus has been recorded in a controlled environment with three different microphones at a distance of one meter. It comprises 180 different speakers with a total of 36 hours of audio recordings. We show recognition results with the open source toolkit Kaldi (20.5% WER) and PocketSphinx (39.6% WER) and make a complete open source solution for German distant speech recognition possible.


Pages: 480-488

Number of pages: 9

