Noise robust voice activity detection using joint phase and magnitude based feature enhancement

Authors
Khomdet Phapatanaburi
Longbiao Wang
Zeyan Oo
Weifeng Li
Seiichi Nakagawa
Masahiro Iwahashi
Affiliations
[1] Nagaoka University of Technology, Tianjin Key Laboratory of Cognitive Computing and Application
[2] School of Computer Science and Technology, Graduate School at Shenzhen
[3] Tianjin University
[4] Tsinghua University
[5] Toyohashi University of Technology
Source
Journal of Ambient Intelligence and Humanized Computing | 2017 / Vol. 8, No. 6
Keywords
Deep neural network (DNN); Phase information; Noise-robust VAD; Feature enhancement
DOI
Not available
Abstract
Recently, deep neural network (DNN)-based feature enhancement has been proposed for many speech applications, and DNN-enhanced features have achieved higher performance than raw features. However, phase information is discarded during most conventional DNN training. In this paper, we propose a DNN-based joint phase- and magnitude-based feature (JPMF) enhancement (JPMF with DNN) and a noise-aware training (NAT)-DNN-based JPMF enhancement (JPMF with NAT-DNN) for noise-robust voice activity detection (VAD). Moreover, to improve the performance of the proposed feature enhancement, a combination of the scores of the proposed phase- and magnitude-based features is also applied. Specifically, mel-frequency cepstral coefficients (MFCCs) and the mel-frequency delta phase (MFDP) are used as the magnitude and phase features, respectively. The experimental results show that the proposed feature enhancement significantly outperforms the conventional magnitude-based feature enhancement. The proposed JPMF with NAT-DNN method achieves the best relative equal error rate (EER), compared with individual magnitude- and phase-based DNN speech enhancement. Moreover, the combined score of the enhanced MFCC and MFDP using JPMF with NAT-DNN further improves the VAD performance.
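The abstract mentions combining the scores of the magnitude- and phase-based features and evaluating with the equal error rate (EER). The sketch below illustrates one common way to do both; it is not taken from the paper. The linear weighted sum, the `weight` parameter, and the function names `combine_scores` and `equal_error_rate` are assumptions made for illustration, and the abstract does not state which combination rule the authors use.

```python
import numpy as np


def combine_scores(mag_scores, phase_scores, weight=0.5):
    """Fuse per-frame speech scores from two feature streams.

    Assumption: a linear weighted sum. `weight` is a hypothetical
    tuning parameter, not a value reported in the abstract.
    """
    mag = np.asarray(mag_scores, dtype=float)
    phase = np.asarray(phase_scores, dtype=float)
    return weight * mag + (1.0 - weight) * phase


def equal_error_rate(scores, labels):
    """Approximate the EER by sweeping a threshold over the observed scores.

    `labels` marks speech frames as True (or 1) and non-speech frames as
    False (or 0). Both classes must be present.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    best_gap, eer = np.inf, 1.0
    for threshold in np.unique(scores):
        decisions = scores >= threshold
        far = np.mean(decisions[~labels])   # non-speech accepted as speech
        frr = np.mean(~decisions[labels])   # speech rejected as non-speech
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the operating point where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2.0
    return eer


if __name__ == "__main__":
    # Toy scores only, for demonstration; not data from the paper.
    mfcc_scores = [0.9, 0.7, 0.4, 0.2]
    mfdp_scores = [0.8, 0.3, 0.6, 0.1]
    frame_labels = [1, 1, 0, 0]
    fused = combine_scores(mfcc_scores, mfdp_scores, weight=0.6)
    print("fused scores:", fused)
    print("EER:", equal_error_rate(fused, frame_labels))
```

With a finite set of frames the false-acceptance and false-rejection rates rarely cross exactly, so the function reports the average of the two rates at the threshold where they are closest.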
Pages: 845-859
Number of pages: 14