A Discrete Wavelet Transform-Based Voice Activity Detection and Noise Classification With Sub-Band Selection

Cited by: 1
Authors
Abdullah, Salinna [1]
Zamani, Majid [1]
Demosthenous, Andreas [1]
Affiliations
[1] UCL, Dept Elect & Elect Engn, Torrington Pl, London WC1E 7JE, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
Keywords
Discrete wavelet transform; mel-frequency cepstral coefficients; multilayer perceptron; noise classification; sub-band selection; voice activity detection; speech enhancement
DOI
10.1109/ISCAS51556.2021.9401647
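Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract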
A real-time, adaptive voice activity detector based on the discrete wavelet transform, together with a sub-band selection scheme for feature extraction, is proposed for noise classification in a speech processing pipeline. Both the voice activity detection and the sub-band selection rely on wavelet energy features. The feature extraction computes mel-frequency cepstral coefficients from the selected wavelet sub-bands and mean absolute values from all sub-bands. Combined with a feedforward neural network with two hidden layers, the method could be added to speech enhancement systems and deployed in hearing devices such as cochlear implants. Compared with the conventional short-time Fourier transform-based technique, it achieves higher F1 scores and classification accuracies (means of 0.916 and 90.1%, respectively) across five noise types (babble, factory, pink, Volvo (car) and white noise), with a significantly smaller feature set of 21 features, a reduced memory requirement, faster training convergence and about half the computational cost.
Pages: 5
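The wavelet energy and mean-absolute-value features mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the Haar wavelet, the three decomposition levels, the function names and the fixed energy-ratio threshold in `is_speech` are all assumptions chosen for brevity, and the MFCC step and the neural network classifier are omitted.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail).

    A trailing sample is dropped if the input length is odd.
    """
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail


def wavelet_subbands(frame, levels=3):
    """Decompose a frame into sub-bands [d1, d2, ..., dL, aL].

    `levels=3` is an illustrative assumption, not a value from the abstract.
    """
    bands = []
    approx = list(frame)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        bands.append(detail)
    bands.append(approx)
    return bands


def subband_energy(band):
    """Wavelet energy of one sub-band (sum of squared coefficients)."""
    return sum(x * x for x in band)


def mean_absolute_value(band):
    """Mean absolute value of the coefficients in one sub-band."""
    return sum(abs(x) for x in band) / len(band)


def is_speech(frame, noise_energy, ratio=2.0, levels=3):
    """Hypothetical VAD rule: flag speech when the total wavelet energy
    exceeds `ratio` times a noise-floor estimate. The abstract describes an
    adaptive detector; this fixed ratio is only a placeholder.
    """
    total = sum(subband_energy(b) for b in wavelet_subbands(frame, levels))
    return total > ratio * noise_energy


if __name__ == "__main__":
    frame = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    bands = wavelet_subbands(frame)
    print([round(subband_energy(b), 3) for b in bands])
    print([round(mean_absolute_value(b), 3) for b in bands])
    print(is_speech(frame, noise_energy=50.0))
```

Because the Haar transform used here is orthonormal, the sub-band energies sum to the energy of the input frame when the frame length is a multiple of 2 raised to the number of levels, which gives a quick sanity check on the decomposition.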