Estimation of the instantaneous pitch of speech

Cited by: 30
Authors
Resch, Barbara [1]
Nilsson, Mattias [1]
Ekman, Anders [1]
Kleijn, W. Bastiaan [1]
Affiliations
[1] Royal Inst Technol, Sound & Image Proc Lab, KTH, S-10044 Stockholm, Sweden
Keywords
instantaneous pitch; pitch estimation; pitch-synchronous processing; splines
DOI
10.1109/TASL.2006.885242
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
An accurate estimation of the pitch is essential for many speech processing applications, such as speech synthesis, speech coding, and speech enhancement. A widely used assumption in most common pitch estimation methods is that pitch is constant over a segment of short duration. This assumption does not apply in reality and leads to inaccurate pitch estimates. In this paper, we present a method for continuous pitch estimation that is able to track fast changes. In the presented framework, the pitch is modeled by a B-spline expansion and optimized in a multistage procedure for increased robustness. The performance of the continuous optimization procedure is compared to state-of-the-art pitch estimation methods and is evaluated both for artificial speech-like signals with known pitch, and for real speech signals. The results of the experiments show that our method leads to a higher accuracy of the estimate of the pitch than state-of-the-art methods.
Pages: 813-822
Number of pages: 10