Extraction of Fundamental Frequency From Degraded Speech Using Temporal Envelopes at High SNR Frequencies

Cited by: 18

Authors:

Aneeja, G. [1]

Yegnanarayana, B. [2]

Affiliations:

[1] Int Inst Informat Technol, Hyderabad 500032, Andhra Pradesh, India

[2] Birla Inst Technol & Sci, Hyderabad 500078, Andhra Pradesh, India

Source:

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2017 / Vol. 25 / Issue 04

Keywords:

Correlogram; dominant frequency; fundamental frequency; pitch period; single frequency filtering; weighted component envelope; PITCH DETECTION; MULTIPITCH TRACKING; NOISY; ALGORITHM; ROBUST;

DOI:

10.1109/TASLP.2017.2666425

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

In this paper, we propose a method for extracting the fundamental frequency (f_o) from degraded speech signals using the single frequency filtering (SFF) approach. The SFF of the frequency-shifted speech signal gives high signal-to-noise ratio (SNR) segments at some frequencies, and hence the SFF approach can be exploited for f_o extraction using the autocorrelation function of those segments. Since f_o is computed from the envelope of a single frequency component of the signal, the vocal tract resonances do not affect the f_o extraction. Using the high SNR frequency component in a given segment helps overcome the effects of degradations in the speech signal without explicitly estimating the characteristics of the noise. The proposed method of f_o extraction is shown to give better performance for several types of real and simulated degradations than some of the methods reported recently in the literature.
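The abstract outlines a pipeline: frequency-shift the signal, filter at a single frequency, take the envelope, and read the pitch period from the envelope's autocorrelation. The record does not give the implementation details, so the following is only a minimal sketch under stated assumptions, not the authors' method. The first-difference step, the pole radius `r = 0.99`, the 80 to 400 Hz search range, the candidate frequencies, and the rule of picking the band with the strongest normalized autocorrelation peak (a simple stand-in for the paper's high-SNR frequency selection) are all illustrative choices, and every function name is hypothetical.

```python
import cmath
import math


def sff_envelope(signal, fs, freq_hz, r=0.99):
    """Envelope of the component near freq_hz, using a single-pole filter at fs/2."""
    # Assumption: shift the component at freq_hz up to fs/2 (angular frequency pi).
    shift = math.pi - 2.0 * math.pi * freq_hz / fs
    envelope = []
    prev_sample = 0.0
    prev_out = 0j
    for n, sample in enumerate(signal):
        diff = sample - prev_sample              # first difference (assumed pre-processing)
        prev_sample = sample
        shifted = diff * cmath.exp(1j * shift * n)
        prev_out = shifted - r * prev_out        # H(z) = 1 / (1 + r z^-1), pole at z = -r
        envelope.append(abs(prev_out))
    return envelope


def autocorr_peak(envelope, fs, f0_min=80.0, f0_max=400.0):
    """Return (lag, peak) of the normalized autocorrelation within the pitch range."""
    mean = sum(envelope) / len(envelope)
    e = [v - mean for v in envelope]
    best_lag, best_peak = None, -1.0
    for lag in range(int(fs / f0_max), int(fs / f0_min) + 1):
        a, b = e[:len(e) - lag], e[lag:]
        denom = math.sqrt(sum(v * v for v in a) * sum(v * v for v in b))
        if denom == 0.0:
            continue                             # flat or too-short envelope at this lag
        peak = sum(x * y for x, y in zip(a, b)) / denom
        if peak > best_peak:
            best_lag, best_peak = lag, peak
    return best_lag, best_peak


def estimate_f0(signal, fs, candidate_freqs, skip=800):
    """Pick the candidate band with the strongest envelope periodicity.

    Returns (f0_hz, peak, band_hz), or None if no band gives a usable peak.
    `skip` drops the filter's start-up transient (an illustrative value).
    """
    best = None
    for freq in candidate_freqs:
        env = sff_envelope(signal, fs, freq)[skip:]
        lag, peak = autocorr_peak(env, fs)
        if lag is not None and (best is None or peak > best[1]):
            best = (fs / lag, peak, freq)
    return best


# Synthetic test signal: 20 equal-amplitude harmonics of 125 Hz, 0.2 s at 8 kHz.
fs = 8000
f0_true = 125.0
signal = [
    sum(math.cos(2.0 * math.pi * h * f0_true * n / fs) for h in range(1, 21))
    for n in range(1600)
]
f0_est, peak, band = estimate_f0(signal, fs, [500.0, 1000.0, 1500.0, 2000.0])
print(f"estimated f0: {f0_est:.1f} Hz from the {band:.0f} Hz band (peak {peak:.3f})")
```

On this clean synthetic signal the estimate should land close to the true 125 Hz. With degraded speech, the choice of band and of the analysis segment would matter far more than this sketch suggests.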


Pages: 829-838

Number of pages: 10

