F0 estimation of speech based on IRAPT using WLP-based TV-CAR analysis

Authors
Shan, Wei [1]
Funaki, Keiichi [2]
Affiliations
[1] Univ Ryukyus, Grad Sch Engn & Sci, Nishihara, Okinawa, Japan
[2] Univ Ryukyus, C&N Ctr, Nishihara, Okinawa, Japan
Keywords
F0; estimation; IRAPT; WLP; complex analysis; analytic signal; linear prediction
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Fundamental frequency (F0) estimation plays an important role in speech processing tasks such as speech coding, synthesis, and recognition. Although existing F0 estimation methods perform well under clean conditions, their performance deteriorates significantly in noisy environments. For this reason, F0 estimation that is robust against additive noise is in demand. We have previously proposed F0 estimation methods based on Time-Varying Complex AR (TV-CAR) analysis, whose criteria include the weighted correlation of the complex residual obtained by TV-CAR analysis and the sum of the harmonics of the complex residual spectrum. Separately, E. Azarov et al. have proposed IRAPT (Instantaneous RAPT), an improved version of RAPT (Robust Algorithm for Pitch Tracking) that uses instantaneous harmonics. IRAPT can achieve better estimation performance than RAPT. Since IRAPT uses a band-limited analytic signal to obtain harmonic frequencies, the complex residual signal obtained by TV-CAR analysis can also be applied to IRAPT. In this paper, a novel F0 estimation method that uses the instantaneous frequency of the robust WLP (Weighted Linear Prediction) TV-CAR residual is proposed and evaluated.
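
The abstract relies on the idea that an instantaneous frequency can be read from a complex (analytic) signal. The Python sketch below illustrates only that general idea and is not the IRAPT algorithm or the authors' method: it builds an analytic signal from a synthetic 200 Hz tone with a naive DFT (an assumption made for self-containment; the paper instead uses the WLP TV-CAR complex residual) and estimates the frequency from the phase advance between consecutive samples. All function names, the sampling rate, and the test tone are hypothetical choices for illustration.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform; adequate for short frames."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            # Reduce k*t modulo n to keep the phase argument small and accurate.
            acc += x[t] * cmath.exp(sign * 2j * math.pi * ((k * t) % n) / n)
        out.append(acc / n if inverse else acc)
    return out


def analytic_signal(x):
    """Double positive-frequency bins, zero negative ones, keep DC and Nyquist."""
    n = len(x)
    spectrum = dft(x)
    for k in range(n):
        if 1 <= k < (n + 1) // 2:   # positive frequencies
            spectrum[k] *= 2
        elif k > n // 2:            # negative frequencies
            spectrum[k] = 0j
    return dft(spectrum, inverse=True)


def instantaneous_frequency(z, fs):
    """Frequency in Hz from the phase advance between consecutive samples."""
    return [
        cmath.phase(z[i + 1] * z[i].conjugate()) * fs / (2 * math.pi)
        for i in range(len(z) - 1)
    ]


# Assumed test setup: 8 kHz sampling, 200 Hz tone, 40 ms frame (8 full periods).
fs = 8000
f0 = 200.0
n = 320
x = [math.cos(2 * math.pi * f0 * t / fs) for t in range(n)]

z = analytic_signal(x)
freqs = instantaneous_frequency(z, fs)
print(min(freqs), max(freqs))
```

The frame length is chosen so that the tone completes a whole number of periods, which avoids spectral leakage in this toy case; real speech frames would need windowing and band limiting before the phase difference is meaningful.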
Pages: 4