A Robust Pitch Extractor Based on DTW Lines and CASA with Application in Noisy Speech Recognition

Cited by: 0

Authors:

Morales-Cordovilla, Juan A. [1]

Cabanas-Molero, Pablo

Peinado, Antonio M. [1]

Sanchez, Victoria [1]

Affiliations:

[1] Univ Granada, Dept Teoria Senal Telemat & Comunicac, E-18071 Granada, Spain

Source:

ADVANCES IN SPEECH AND LANGUAGE TECHNOLOGIES FOR IBERIAN LANGUAGES | 2012 / Vol. 328

Keywords:

pitch extractor; pitch line; CASA; DTW; noise; robust speech recognition;

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper proposes a robust pitch extractor for Automatic Speech Recognition, based on selecting pitch lines from a tonegram (a representation of the energy of each candidate pitch at each time frame). First, the tonegram and its maximum-energy regions are extracted, and a Dynamic Time Warping (DTW) algorithm finds the most energetic trajectories, or pitch lines, within these regions. A second stage estimates the tonegram of the most energetic lines by applying Computational Auditory Scene Analysis (CASA) rules, which reject and group octave-related lines. The speaker's mean pitch is then estimated, and the final pitch is obtained by rejecting lines that lie far from this mean. The proposed pitch extractor is evaluated in a novel way: by the word accuracy of a Missing Data recognizer on the Aurora-2 database.
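The abstract gives no implementation details, so the following is only a minimal illustrative sketch of two of the ideas it mentions, not the authors' algorithm. It assumes the tonegram is a list of frames, each holding one energy value per pitch candidate, and it uses a generic dynamic-programming path search (in the spirit of DTW) to find the most energetic trajectory. The function names, the `max_jump` continuity constraint, and the octave tolerance `max_octaves` are all hypothetical.

```python
import math


def most_energetic_line(tonegram, max_jump=1):
    """Return one pitch-bin index per frame for the highest-energy path.

    tonegram: list of frames; each frame is a list of energies, one per
              pitch candidate (assumed layout, not taken from the paper).
    max_jump: largest allowed change in pitch bin between consecutive
              frames (an assumed continuity constraint).
    """
    n_frames = len(tonegram)
    n_bins = len(tonegram[0])
    score = [list(tonegram[0])]   # best accumulated energy ending at each bin
    back = [[0] * n_bins]         # back-pointers for path recovery
    for t in range(1, n_frames):
        row_score, row_back = [], []
        for p in range(n_bins):
            lo = max(0, p - max_jump)
            hi = min(n_bins - 1, p + max_jump)
            # Best predecessor among the bins reachable from p.
            best_q = max(range(lo, hi + 1), key=lambda q: score[t - 1][q])
            row_score.append(score[t - 1][best_q] + tonegram[t][p])
            row_back.append(best_q)
        score.append(row_score)
        back.append(row_back)
    # Backtrack from the best final bin to recover the whole line.
    p = max(range(n_bins), key=lambda q: score[-1][q])
    path = [p]
    for t in range(n_frames - 1, 0, -1):
        p = back[t][p]
        path.append(p)
    path.reverse()
    return path


def keep_near_mean(line_pitches_hz, mean_pitch_hz, max_octaves=0.5):
    """Keep lines whose pitch lies within max_octaves of the speaker mean."""
    return [
        f for f in line_pitches_hz
        if abs(math.log2(f / mean_pitch_hz)) <= max_octaves
    ]


tonegram = [
    [0, 5, 0, 0],
    [0, 0, 4, 0],
    [9, 0, 3, 0],
    [0, 0, 0, 6],
]
line = most_energetic_line(tonegram, max_jump=1)   # [1, 2, 2, 3]
kept = keep_near_mean([110.0, 220.0, 118.0], mean_pitch_hz=115.0)  # [110.0, 118.0]
```

In this toy tonegram, picking the maximum of each frame independently would jump to bin 0 in the third frame. The continuity constraint instead yields a smooth line, and the octave-related candidate at 220 Hz is rejected because it lies almost a full octave above the assumed mean of 115 Hz.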

Pages: 197-206

Page count: 10
