Extraction of voiced regions of speech from emotional speech signals using wavelet-pitch method

Authors
Dendukuri L.S. [1]
Hussain S.J. [1]
Affiliations
[1] Department of Electronics and Communication Engineering, Vignan's Foundation for Science, Technology and Research, Guntur, Andhra Pradesh, Vadlamudi
Keywords
Autocorrelation; Emotional speech; Pitch; Thresholding; Wavelets
DOI
10.3311/PPEE.15373
Abstract
Extracting voiced regions of speech is an active topic in the speech domain, with uses across many speech applications. In emotional speech signals, most of the information is carried by the voiced regions. In this work, voiced regions are extracted from emotional speech signals using a wavelet-pitch method. The speech signals are downsampled, and the Daubechies wavelet (Db4) is applied to the speech frames. The autocorrelation function is computed on the approximation coefficients extracted from each speech frame, and the corresponding pitch values are obtained. A local threshold is defined on the pitch values to extract voiced regions. The thresholds differ for male and female speakers because male pitch values are generally lower than female pitch values. The pitch values are scaled down and compared with the thresholds to extract the voiced frames. A transition frame between voiced and unvoiced frames is also extracted if the preceding frame is voiced, which preserves the emotional content of the extracted frames. The extracted frames are reshaped to form the desired emotional speech signal. Signal-to-noise ratio (SNR), normalized root mean square error (NRMSE), and statistical parameters are used as evaluation metrics. The method gives better SNR and NRMSE values than the zero crossing-energy and residual-signal-based methods for voiced region extraction. Within the wavelet-pitch method, the Db4 wavelet gives better results than the Haar and Db2 wavelets when extracting voiced regions from emotional speech signals. © 2021 Budapest University of Technology and Economics. All rights reserved.
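The pipeline described in the abstract can be sketched roughly as follows. This is an illustrative reading of the abstract, not the authors' implementation: the function names, the frame length, the `fmin`/`fmax` search range, the `min_corr` periodicity check, and the 80 Hz threshold are all assumptions made for this example, and the pitch scaling step mentioned in the abstract is omitted because its details are not given there. The Db4 low-pass filter is hard-coded so the sketch needs only the standard library.

```python
import math

# Db4 decomposition low-pass filter (8 taps), hard-coded to avoid dependencies.
DB4_LO = [
    -0.010597401785069032, 0.0328830116668852,
    0.030841381835560764, -0.18703481171909309,
    -0.027983769416859854, 0.6308807679298589,
    0.7148465705529157, 0.2303778133088965,
]

def approx_coeffs(frame):
    """One-level DWT approximation: low-pass filter, keep every second output.
    No boundary extension is applied (an assumption), so edge samples are dropped."""
    taps = len(DB4_LO)
    return [
        sum(DB4_LO[k] * frame[start + taps - 1 - k] for k in range(taps))
        for start in range(0, len(frame) - taps + 1, 2)
    ]

def autocorr_pitch(x, fs, fmin=60.0, fmax=400.0, min_corr=0.3):
    """Return a pitch estimate in Hz, or 0.0 if no clear periodicity is found.
    fmin, fmax and min_corr are hypothetical defaults chosen for this sketch."""
    mean = sum(x) / len(x)
    x = [v - mean for v in x]
    energy = sum(v * v for v in x)
    if energy == 0.0:
        return 0.0
    best_lag, best_r = 0, min_corr
    lag_lo = max(1, int(fs / fmax))
    lag_hi = min(len(x) - 1, int(fs / fmin))
    for lag in range(lag_lo, lag_hi + 1):
        # Normalized autocorrelation at this lag.
        r = sum(x[i] * x[i + lag] for i in range(len(x) - lag)) / energy
        if r > best_r:
            best_lag, best_r = lag, r
    return fs / best_lag if best_lag else 0.0

def select_frames(frames, fs, threshold_hz):
    """Return per-frame pitch estimates and the indices of frames to keep."""
    # Approximation coefficients are at half the input sampling rate.
    pitches = [autocorr_pitch(approx_coeffs(f), fs / 2) for f in frames]
    voiced = [p >= threshold_hz for p in pitches]
    # Keep voiced frames, plus any frame whose previous frame is voiced (transition).
    keep = [i for i, v in enumerate(voiced) if v or (i > 0 and voiced[i - 1])]
    return pitches, keep

# Synthetic check: a 150 Hz tone stands in for voiced speech, silence for unvoiced.
fs = 8000
tone = [math.sin(2 * math.pi * 150 * n / fs) for n in range(256)]
silence = [0.0] * 256
pitches, keep = select_frames([silence, tone, tone, silence, silence], fs, threshold_hz=80.0)
```

In this toy run the two tone frames are kept as voiced, and the silent frame that directly follows them is kept as a transition frame. A speaker-dependent threshold, as the abstract describes for male and female speakers, would be passed in through `threshold_hz`.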
Pages: 262-278
Number of pages: 16