Open Source German Distant Speech Recognition: Corpus and Acoustic Model

Cited by: 17
Authors
Radeck-Arneth, Stephan [1,2]
Milde, Benjamin [1]
Lange, Arvid [1,2]
Gouvea, Evandro
Radomski, Stefan [1]
Muehlhaeuser, Max [1]
Biemann, Chris [1]
Affiliations
[1] Tech Univ Darmstadt, Dept Comp Sci, Language Technol Grp, Darmstadt, Germany
[2] Tech Univ Darmstadt, Dept Comp Sci, Telecooperat Grp, Darmstadt, Germany
Source
TEXT, SPEECH, AND DIALOGUE (TSD 2015) | 2015 / Vol. 9302
Keywords
German speech recognition; Open source; Speech corpus; Distant speech recognition; Speaker-independent
DOI
10.1007/978-3-319-24033-6_54
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present a new freely available corpus for German distant speech recognition and report speaker-independent word error rate (WER) results for two open source speech recognizers trained on this corpus. The corpus has been recorded in a controlled environment with three different microphones at a distance of one meter. It comprises 180 different speakers with a total of 36 hours of audio recordings. We show recognition results with the open source toolkit Kaldi (20.5% WER) and PocketSphinx (39.6% WER) and make a complete open source solution for German distant speech recognition possible.
Pages: 480-488
Number of pages: 9
Related papers
50 in total
  • [1] Developing an Open-Source Corpus of Yoruba Speech
    Gutkin, Alexander
    Demirsahin, Isin
    Kjartansson, Oddur
    Rivera, Clara
    Tubosun, Kola
    INTERSPEECH 2020, 2020, : 404 - 408
  • [2] Neural Blind Source Separation and Diarization for Distant Speech Recognition
    Bando, Yoshiaki
    Nakamura, Tomohiko
    Watanabe, Shinji
    INTERSPEECH 2024, 2024, : 722 - 726
  • [3] HYBRID ACOUSTIC MODELS FOR DISTANT AND MULTICHANNEL LARGE VOCABULARY SPEECH RECOGNITION
    Swietojanski, Pawel
    Ghoshal, Arnab
    Renals, Steve
    2013 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING (ASRU), 2013, : 285 - 290
  • [4] Acoustic Event Mixing to Multichannel AMI Data for Distant Speech Recognition and Acoustic Event Classification Benchmarking
    Astapov, Sergei
    Svirskiy, Gleb
    Lavrentyev, Aleksandr
    Prisyach, Tatyana
    Popov, Dmitriy
    Ubskiy, Dmitriy
    Kabarov, Vladimir
    SPEECH AND COMPUTER, SPECOM 2019, 2019, 11658 : 31 - 42
  • [5] THE DIRHA-ENGLISH CORPUS AND RELATED TASKS FOR DISTANT-SPEECH RECOGNITION IN DOMESTIC ENVIRONMENTS
    Ravanelli, Mirco
    Cristoforetti, Luca
    Gretter, Roberto
    Pellin, Marco
    Sosi, Alessandro
    Omologo, Maurizio
    2015 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING (ASRU), 2015, : 275 - 282
  • [6] SPRAAK: an open source "SPeech Recognition and Automatic Annotation Kit"
    Demuynck, Kris
    Roelens, Jan
    Van Compernolle, Dirk
    Wambacq, Patrick
    INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008, : 495 - 495
  • [7] NEURAL NETWORKS FOR DISTANT SPEECH RECOGNITION
    Renals, Steve
    Swietojanski, Pawel
    2014 4TH JOINT WORKSHOP ON HANDS-FREE SPEECH COMMUNICATION AND MICROPHONE ARRAYS (HSCMA), 2014, : 172 - 176
  • [8] On distant speech recognition for home automation
    Vacher, Michel
    Lecouteux, Benjamin
    Portet, François
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2015, 8700 : 161 - 188
  • [9] Speaker Verification Performance Evaluation Based on Open Source Speech Processing Software and TIMIT Speech Corpus
    Klosowski, Piotr
    Dustor, Adam
    Izydorczyk, Jacek
    COMPUTER NETWORKS, CN 2015, 2015, 522 : 400 - 409
  • [10] Real-time blind source separation system with applications to distant speech recognition
    Ferreira, Alberto E. A.
    Alarcao, Diogo
    APPLIED ACOUSTICS, 2016, 113 : 170 - 184