A pitch determination and voiced/unvoiced decision algorithm for noisy speech

Cited by: 46
Authors
Rouat, J
Liu, YC
Morissette, D
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
auditory model; car speech; telephone speech; multi-channel selection; Teager energy operator; amplitude modulation; residue pitch;
DOI
10.1016/S0167-6393(97)00002-2
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
The design of a pitch tracking system for noisy speech is a challenging and still unsolved problem, because it combines "traditional" pitch determination problems with those of noise processing. We have developed a multi-channel pitch determination algorithm (PDA) that has been tested on three speech databases (0 dB SNR telephone speech, speech recorded in a car, and clean speech) involving fifty-eight speakers. Our system has been compared to a multi-channel PDA based on auditory modelling (AMPEX) and to hand-labelled and Laryngograph pitch contours. Our PDA comprises an automatic channel selection module and a pitch extraction module that relies on a pseudo-periodic histogram (a combination of normalised scalar products for the less corrupted channels) to find pitch. Our PDA outperformed the reference system on 0 dB telephone speech and car speech. The automatic selection of channels was effective on the very noisy telephone speech (0 dB) but contributed less on car speech, where the robustness of the system relative to AMPEX is mainly due to the pitch extraction module. This paper reports in detail the voiced/unvoiced and unvoiced/voiced performance and the pitch estimation errors for the proposed PDA and the reference system on the three speech databases.
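The abstract does not give the formula behind the pseudo-periodic histogram, so the following is only a rough sketch of the general idea it names: compute a normalised scalar product between a segment and its lagged copy in each retained channel, sum the results over channels, and take the lag with the highest total as the period candidate. The function names, the summation rule, the lag range and the synthetic channels are all assumptions made for illustration and are not taken from the paper; channel selection and the voiced/unvoiced decision are omitted.

```python
import math

def normalised_scalar_product(frame, lag, length):
    """Cosine similarity between frame[0:length] and frame[lag:lag+length]."""
    a = frame[:length]
    b = frame[lag:lag + length]
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0

def estimate_period(channels, min_lag, max_lag, length):
    """Sum per-channel similarity at each candidate lag; return the best lag.

    Assumption: a plain sum over channels stands in for the paper's
    "combination" of normalised scalar products, which the abstract
    does not specify.
    """
    scores = {
        lag: sum(normalised_scalar_product(ch, lag, length) for ch in channels)
        for lag in range(min_lag, max_lag + 1)
    }
    return max(scores, key=scores.get)

# Toy input: two synthetic "channels" sharing an 80-sample period
# (100 Hz at an assumed 8 kHz sampling rate).
period, n = 80, 400
ch1 = [math.sin(2 * math.pi * i / period) for i in range(n)]
ch2 = [math.sin(2 * math.pi * i / period)
       + 0.5 * math.sin(4 * math.pi * i / period) for i in range(n)]

best_lag = estimate_period([ch1, ch2], min_lag=20, max_lag=120, length=200)
print(best_lag)
```

The lag range is deliberately kept below twice the true period so that the toy example has a single clear maximum; a real tracker would need an explicit rule for choosing among multiples of the period.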
Pages: 191-207
Page count: 17