IFE: NN-aided Instantaneous Pitch Estimation

Cited by: 0
Authors
Blok, Marek [1]
Balla, Jan [1]
Pietrolaj, Mariusz [1]
Affiliations
[1] Gdansk Univ Technol, Fac Elect Telecommun & Informat, Gdansk, Poland
Source
2021 14TH INTERNATIONAL CONFERENCE ON HUMAN SYSTEM INTERACTION, HSI | 2021
Keywords
pitch estimation; machine learning; speech synthesis; data augmentation; neural network; IFE; TRACKING; ROBUST; SPEECH; YIN;
DOI
10.1109/HSI52170.2021.9538713
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Pitch estimation is still an open issue in contemporary signal processing research. The growing momentum of machine learning applications in a data-driven society allows this problem to be tackled from a new perspective. This work leverages that opportunity to propose a refined Instantaneous Frequency and power based pitch Estimator method called IFE. It combines deep neural network based pitch estimation with an audio front end that extracts the instantaneous frequency and power of signal components. A thorough analysis of the results identifies the major advantages and shortcomings of the method and leads to a wide array of suggestions for future improvement. Although IFE offers instantaneous temporal resolution, it is compared against state-of-the-art pitch estimators that operate on time windows. The comparison shows a comparable degree of prediction accuracy (up to 6% accuracy improvement) while maintaining the advantage of higher temporal resolution.
Pages: 78-84
Page count: 7