Pitch estimation using models of voiced speech on three levels

Cited by: 0

Authors:

Joho, Dominik [1]

Bennewitz, Maren [1]

Behnke, Sven [1]

Affiliation:

[1] Univ Freiburg, Dept Comp Sci, D-7800 Freiburg, Germany

Source:

2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3 | 2007

Keywords:

pitch estimation; speech analysis; matrix decomposition

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

We present an algorithm for estimating the fundamental frequency in speech signals. Our approach incorporates models of voiced speech on three levels. First, we estimate the pitch for each time frame from its harmonic structure using non-negative matrix factorization. Second, we exploit temporal pitch continuity to extract partial pitch contours. Third, we use statistics on the succession of voiced segments to aggregate the partial contours into the final contour of an utterance. We evaluate our approach on the Keele database. The experimental results show that our method is robust for noisy speech and performs well on clean speech in comparison with state-of-the-art algorithms.
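The abstract does not give implementation details, so the following is only a rough sketch of what the first level (per-frame pitch from harmonic structure via a non-negative decomposition) might look like. Everything specific here is an assumption made for illustration and not taken from the paper: the fixed dictionary of harmonic combs, the Gaussian peak shape, the 1/k harmonic weighting, the Euclidean multiplicative update, and all function names and parameter values. The second and third levels (contour tracking and aggregation) are not sketched.

```python
import numpy as np

def harmonic_dictionary(f0_grid, freqs, n_harmonics=10, width_hz=20.0):
    """Build one unit-norm harmonic comb per candidate f0 (assumed template shape)."""
    W = np.zeros((len(freqs), len(f0_grid)))
    for j, f0 in enumerate(f0_grid):
        for k in range(1, n_harmonics + 1):
            # Gaussian bump at the k-th harmonic, weighted by 1/k (an assumption).
            W[:, j] += np.exp(-0.5 * ((freqs - k * f0) / width_hz) ** 2) / k
        W[:, j] /= np.linalg.norm(W[:, j])
    return W

def nonnegative_activations(v, W, n_iter=200, eps=1e-12):
    """Approximate v ~ W @ h with h >= 0, keeping the dictionary W fixed.

    Uses the standard multiplicative update for the Euclidean NMF cost.
    """
    h = np.ones(W.shape[1])
    WtV = W.T @ v
    WtW = W.T @ W
    for _ in range(n_iter):
        h *= WtV / (WtW @ h + eps)
    return h

def estimate_frame_pitch(frame, sample_rate, f0_grid):
    """Return the candidate f0 with the largest activation, plus all activations."""
    windowed = frame * np.hanning(len(frame))
    spectrum = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    W = harmonic_dictionary(f0_grid, freqs)
    h = nonnegative_activations(spectrum, W)
    return f0_grid[np.argmax(h)], h

def synth_voiced_frame(f0, sample_rate, n_samples, n_partials=5):
    """Toy voiced frame: a sum of harmonically related sinusoids (test helper)."""
    t = np.arange(n_samples) / sample_rate
    return sum(np.sin(2 * np.pi * k * f0 * t) / k for k in range(1, n_partials + 1))
```

A call such as `estimate_frame_pitch(synth_voiced_frame(200.0, 16000, 1024), 16000, np.arange(80.0, 401.0, 5.0))` returns the best-scoring candidate together with the activation vector. In a multi-level scheme like the one the abstract describes, the full activation vector would presumably be more useful to later stages than the single argmax, since it keeps competing pitch hypotheses available for contour tracking.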


Pages: 1077 / +

Page count: 2

