German Speech Recognition System using DeepSpeech

Cited by: 3

Authors:

Xu, Jiahua [1]

Matta, Kaveen [1]

Islam, Shaiful [1]

Nuernberger, Andreas [1]

Affiliation:

[1] Otto von Guericke Univ, Magdeburg, Germany

Source:

2020 4th International Conference on Natural Language Processing and Information Retrieval (NLPIR 2020) | 2020

Keywords:

Deep learning; neural networks; speech-to-text; natural language processing

DOI:

10.1145/3443279.3443313

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Speech recognition focuses on the translation of speech from an audio format into text. In the domain of voice/speech recognition, popular open-source models are available for the English language; however, open models and training schemes for German are rather rare. An end-to-end, real-time German speech-to-text system based on multiple German language datasets is worthy of more attention and further investigation. In this paper, we combine multiple available German datasets and optimize DeepSpeech for training a real-time German speech-to-text model. A GUI is also proposed to demonstrate its functionality. Our model performs well compared with other state-of-the-art models, given that we used noisy data to replicate real-life scenarios. We release our fully trained German model along with its parameter configurations to promote the diversification of open-source models for the German language.

Citation

Pages: 102-106

Page count: 5
