Direct F0 Estimation with Neural-Network-based Regression

Cited by: 5
Authors
Xu, Shuzhuang [1]
Shimodaira, Hiroshi [2]
Affiliations
[1] Univ Edinburgh, Sch Informat, Edinburgh, Midlothian, Scotland
[2] Univ Edinburgh, Ctr Speech Technol Res, Edinburgh, Midlothian, Scotland
Keywords
fundamental frequency; pitch tracking; neural network; pitch; tracking
DOI
10.21437/Interspeech.2019-3267
Chinese Library Classification
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification codes
100104; 100213
Abstract
Pitch tracking, the continuous extraction of fundamental frequency (F0) from speech waveforms, is of vital importance to many applications in speech analysis and synthesis. Speech pitch tracking has been investigated extensively through many existing trackers, including conventional ones such as Praat, RAPT and YIN, and more recent neural-network-based ones such as DNN-CLS, CREPE and RNN-REG. This work develops a different end-to-end regression model based on neural networks, in which a voice detector and a newly proposed value estimator work jointly to highlight the trajectory of fundamental frequency. Experiments on the PTDB-TUG corpus showed that the system surpasses canonical neural networks in terms of gross error rate. It further outperformed conventional trackers under clean conditions and neural-network classifiers under noisy conditions simulated with the NOISEX-92 corpus.
Pages: 1995-1999
Page count: 5