Spectral Analysis for Automatic Speech Recognition and Enhancement

Cited by: 4

Authors:

Oruh, Jane [1]

Viriri, Serestina [1]

Affiliations:

[1] Univ KwaZulu Natal, Sch Math Stat & Comp Sci, ZA-4000 Durban, South Africa

Source:

MACHINE LEARNING FOR NETWORKING, MLN 2020 | 2021 / Vol. 12629

Keywords:

Noise reduction; STFT filtering; Spectrum estimation; Automatic speech recognition; Speech enhancement; Signal-to-Noise-Ratio; STFT;

DOI:

10.1007/978-3-030-70866-5_16

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Accurate recognition of noisy speech signals is still an obstacle to wider application of speech recognition technology. The robustness of a speech recognition system is heavily influenced by its ability to handle background noise. In this work, a Short-Time Fourier Transform (STFT) filtering technique for the enhancement and recognition of speech signals is presented. Conventionally, STFT filtering has been applied in speech analysis. In this study, however, a modified STFT with an adaptive window width based on the chirp rate, termed ASTFT, is combined with spectrogram features and proposed for optimal speech recognition and enhancement. The LibriSpeech ASR Corpus is the benchmark dataset for this experiment. The spectrum of the enhanced speech signal is estimated using several spectrogram features to obtain a unit peak amplitude. A priori Signal-to-Noise Ratio (SNR) estimation is performed on the modified STFT speech signal, which achieved an SNR of 31.86 dB, a level considered to be effectively clean speech.
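To give the reported 31.86 dB figure some scale, the following is a minimal sketch of the textbook SNR definition in decibels, assuming the clean reference signal is available. It is not the a priori SNR estimator used in the paper, which this record does not specify. The function names `snr_db` and `db_to_power_ratio` and the synthetic test tones are hypothetical, chosen only for illustration.

```python
import math


def snr_db(clean, noisy):
    """Return the SNR in dB, treating (noisy - clean) as the noise component.

    Textbook definition: 10 * log10(signal power / noise power).
    Assumes both sequences are sample-aligned and of equal, nonzero length.
    """
    if len(clean) != len(noisy) or len(clean) == 0:
        raise ValueError("signals must be non-empty and of equal length")
    signal_power = sum(s * s for s in clean) / len(clean)
    noise_power = sum((n - s) ** 2 for s, n in zip(clean, noisy)) / len(clean)
    if noise_power == 0:
        return math.inf  # no residual noise at all
    return 10.0 * math.log10(signal_power / noise_power)


def db_to_power_ratio(db):
    """Convert a decibel value back to a linear power ratio."""
    return 10.0 ** (db / 10.0)


# Hypothetical test data: a 440 Hz tone plus a 1 kHz interferer at one tenth
# of its amplitude, one second at an assumed 8 kHz sampling rate.
fs = 8000
clean = [math.sin(2 * math.pi * 440 * n / fs) for n in range(fs)]
noisy = [s + 0.1 * math.sin(2 * math.pi * 1000 * n / fs)
         for n, s in enumerate(clean)]

print(f"SNR of test signal: {snr_db(clean, noisy):.2f} dB")
print(f"31.86 dB as a power ratio: {db_to_power_ratio(31.86):.0f}")
```

Under this definition, an SNR of 31.86 dB corresponds to signal power roughly 1500 times the noise power.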

Pages: 245-254

Number of pages: 10
