Noise robust voice activity detection using joint phase and magnitude based feature enhancement

Authors
Khomdet Phapatanaburi
Longbiao Wang
Zeyan Oo
Weifeng Li
Seiichi Nakagawa
Masahiro Iwahashi
Affiliations
[1] Nagaoka University of Technology, Tianjin Key Laboratory of Cognitive Computing and Application
[2] School of Computer Science and Technology, Graduate School at Shenzhen
[3] Tianjin University
[4] Tsinghua University
[5] Toyohashi University of Technology
Source
Journal of Ambient Intelligence and Humanized Computing | 2017 / Vol. 8
Keywords
Deep neural network (DNN); Phase information; Noise-robust VAD; Feature enhancement
DOI
Not available
Abstract
Recently, deep neural network (DNN)-based feature enhancement has been proposed for many speech applications, and DNN-enhanced features have achieved higher performance than raw features. However, phase information is discarded during most conventional DNN training. In this paper, we propose two methods for noise-robust voice activity detection (VAD): a DNN-based joint phase- and magnitude-based feature (JPMF) enhancement (JPMF with DNN) and a noise-aware training (NAT)-DNN-based JPMF enhancement (JPMF with NAT-DNN). Mel-frequency cepstral coefficients (MFCCs) serve as the magnitude feature and the mel-frequency delta phase (MFDP) as the phase feature. To further improve performance, the scores of the enhanced phase- and magnitude-based features are also combined. The experimental results show that the proposed feature enhancement significantly outperforms conventional magnitude-based feature enhancement. JPMF with NAT-DNN achieves the best relative equal error rate (EER) compared with individual magnitude- and phase-based DNN speech enhancement. Moreover, combining the scores of the enhanced MFCC and MFDP obtained with JPMF with NAT-DNN further improves VAD performance.
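The abstract reports results as an equal error rate (EER) and mentions combining the scores of the magnitude and phase features, without stating how either is computed. The following is a minimal sketch of both ideas, not the authors' implementation. It assumes per-frame scores where higher means "speech", labels of 1 for speech and 0 for non-speech, and a simple weighted sum for the combination. The names `equal_error_rate` and `fuse_scores` and the weight `alpha` are hypothetical.

```python
def equal_error_rate(scores, labels):
    """Approximate the EER by sweeping the observed scores as thresholds.

    Assumption: a frame is classified as speech when score >= threshold.
    """
    speech = [s for s, y in zip(scores, labels) if y == 1]
    nonspeech = [s for s, y in zip(scores, labels) if y == 0]
    best_gap, eer = float("inf"), 1.0
    for t in sorted(set(scores)):
        # False acceptance rate: non-speech frames accepted as speech.
        far = sum(s >= t for s in nonspeech) / len(nonspeech)
        # False rejection rate: speech frames rejected as non-speech.
        frr = sum(s < t for s in speech) / len(speech)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the operating point where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


def fuse_scores(mfcc_scores, mfdp_scores, alpha=0.5):
    """Weighted sum of two score streams (alpha is a hypothetical weight)."""
    return [alpha * m + (1 - alpha) * p
            for m, p in zip(mfcc_scores, mfdp_scores)]


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    mfcc_scores = [0.1, 0.6, 0.4, 0.9]  # illustrative values, not from the paper
    mfdp_scores = [0.2, 0.1, 0.7, 0.8]  # illustrative values, not from the paper
    fused = fuse_scores(mfcc_scores, mfdp_scores)
    print(equal_error_rate(mfcc_scores, labels), equal_error_rate(fused, labels))
```

Sweeping only the observed scores keeps the sketch short. With few frames the two error rates may never cross exactly, so the function returns their average at the closest point.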
Pages: 845-859
Page count: 14