A modulation property of time-frequency derivatives of filtered phase and its application to aperiodicity and fo estimation

Cited by: 1
Authors
Kawahara, Hideki [1]
Sakakibara, Ken-Ichi [2]
Morise, Masanori [3]
Banno, Hideki [4]
Toda, Tomoki [5]
Affiliations
[1] Wakayama Univ, Wakayama, Japan
[2] Hlth Sci Univ Hokkaido, Tobetsu, Hokkaido, Japan
[3] Univ Yamanashi, Yamanashi, Japan
[4] Meijo Univ, Nagoya, Aichi, Japan
[5] Nagoya Univ, Grad Sch Informat Sci, Nagoya, Aichi, Japan
Source
18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION | 2017
Keywords
speech analysis; fundamental frequency; aperiodicity; instantaneous frequency; group delay; INSTANTANEOUS-FREQUENCY; SPEECH; PITCH; REPRESENTATIONS; DECOMPOSITION; EXCITATION; WINDOWS; SIGNAL
DOI
10.21437/Interspeech.2017-436
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We introduce a simple and linear SNR (strictly speaking, periodic-to-random power ratio) estimator (0 dB to 80 dB without additional calibration/linearization) for providing reliable descriptions of aperiodicity in speech corpora. The main idea of this method is to estimate the background random noise level without directly extracting the background noise. The proposed method is applicable to a wide variety of time windowing functions with very low sidelobe levels. The estimate combines the frequency derivative and the time-frequency derivative of the mapping from filter center frequency to the output instantaneous frequency. This procedure can replace the periodicity detection and aperiodicity estimation subsystems of a recently introduced open-source vocoder, the YANG vocoder. Source code of a MATLAB implementation of this method will also be open-sourced.
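The mapping named in the abstract, from filter center frequency to output instantaneous frequency, can be illustrated with a small numerical sketch. The Python code below is not the authors' estimator: it does not compute the time-frequency derivative or a calibrated SNR. It only shows, under assumed settings, that the frequency derivative of the mapping stays near zero around a clean sinusoid and moves away from zero once random noise is added. The function name `if_map`, the Gaussian window, the 40 ms window length, the 200 Hz test tone, and the roughly 0 dB noise level are all assumptions chosen for illustration.

```python
import numpy as np

def if_map(x, fs, centers, win_dur=0.04):
    """Return the output instantaneous frequency (Hz) for each filter center frequency."""
    half = int(win_dur * fs) // 2
    t = np.arange(-half, half + 1) / fs
    # Assumed window: Gaussian truncated at +/-4 sigma (low sidelobes).
    window = np.exp(-0.5 * (t / (win_dur / 8)) ** 2)
    mid = len(x) // 2
    seg0 = x[mid - half : mid + half + 1]      # frame centered on sample `mid`
    seg1 = x[mid - half + 1 : mid + half + 2]  # same frame, one sample later
    out = np.empty(len(centers))
    for i, fc in enumerate(centers):
        # Complex band-pass filter output evaluated at two adjacent samples.
        kernel = window * np.exp(-2j * np.pi * fc * t)
        y0 = np.sum(seg0 * kernel)
        y1 = np.sum(seg1 * kernel)
        # Instantaneous frequency from the phase advance over one sample.
        out[i] = np.angle(y1 * np.conj(y0)) * fs / (2 * np.pi)
    return out

fs, f0 = 8000, 200.0                       # assumed sampling rate and test tone
n = np.arange(int(0.2 * fs))
rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * f0 * n / fs)
noisy = clean + rng.normal(scale=np.sqrt(0.5), size=n.size)  # roughly 0 dB SNR

centers = np.arange(180.0, 221.0, 2.0)     # filter center frequencies around f0
clean_map = if_map(clean, fs, centers)
noisy_map = if_map(noisy, fs, centers)

# Mean absolute frequency derivative of the mapping (dimensionless, Hz per Hz).
clean_dev = np.mean(np.abs(np.gradient(clean_map, centers)))
noisy_dev = np.mean(np.abs(np.gradient(noisy_map, centers)))
print(clean_dev, noisy_dev)
```

For the clean tone the mapping is essentially flat at 200 Hz, so its frequency derivative is close to zero. Added noise perturbs the mapping, which is the kind of deviation the abstract's estimator relates to the periodic-to-random power ratio.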
Pages: 424-428
Page count: 5