Noise robust voice activity detection using joint phase and magnitude based feature enhancement

Cited by: 0

Authors:

Khomdet Phapatanaburi

Longbiao Wang

Zeyan Oo

Weifeng Li

Seiichi Nakagawa

Masahiro Iwahashi

Affiliations:

[1] Nagaoka University of Technology, Tianjin Key Laboratory of Cognitive Computing and Application

[2] School of Computer Science and Technology, Graduate School at Shenzhen

[3] Tianjin University

[4] Tsinghua University

[5] Toyohashi University of Technology

Source:

Journal of Ambient Intelligence and Humanized Computing | 2017 / Vol. 8

Keywords:

Deep neural network (DNN); Phase information; Noise-robust VAD; Feature enhancement

DOI:

Not available

Abstract:

Recently, deep neural network (DNN)-based feature enhancement has been proposed for many speech applications. DNN-enhanced features have achieved higher performance than raw features. However, phase information is discarded during most conventional DNN training. In this paper, we propose a DNN-based joint phase- and magnitude-based feature (JPMF) enhancement (JPMF with DNN) and a noise-aware training (NAT)-DNN-based JPMF enhancement (JPMF with NAT-DNN) for noise-robust voice activity detection (VAD). Moreover, to improve the performance of the proposed feature enhancement, a combination of the scores of the proposed phase- and magnitude-based features is also applied. Specifically, mel-frequency cepstral coefficients (MFCCs) and the mel-frequency delta phase (MFDP) are used as the magnitude and phase features, respectively. The experimental results show that the proposed feature enhancement significantly outperforms the conventional magnitude-based feature enhancement. The proposed JPMF with NAT-DNN method achieves the best relative equal error rate (EER) compared with individual magnitude- and phase-based DNN speech enhancement. Moreover, the combined score of the enhanced MFCC and MFDP using JPMF with NAT-DNN further improves the VAD performance.
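
The abstract mentions combining the scores of the magnitude- and phase-based features and evaluating by equal error rate, but it does not state the combination rule or the EER procedure. The following is only a minimal sketch under assumptions: a linear weighted sum with a hypothetical weight `alpha`, and an approximate EER found by sweeping thresholds over the observed scores. The function names `fuse_scores` and `equal_error_rate` are illustrative and do not come from the paper.

```python
def fuse_scores(mag_scores, phase_scores, alpha=0.5):
    """Linearly combine per-frame scores.

    alpha is a hypothetical weight on the magnitude-based (e.g. MFCC) score;
    (1 - alpha) goes to the phase-based (e.g. MFDP) score. The abstract does
    not specify the actual combination rule.
    """
    if len(mag_scores) != len(phase_scores):
        raise ValueError("score lists must have the same length")
    return [alpha * m + (1.0 - alpha) * p
            for m, p in zip(mag_scores, phase_scores)]


def equal_error_rate(scores, labels):
    """Approximate EER for frame-level speech (1) / non-speech (0) labels.

    Sweeps every observed score as a threshold, classifies a frame as speech
    when score >= threshold, and returns the mean of the false-acceptance and
    false-rejection rates at the threshold where the two are closest.
    """
    n_speech = sum(1 for y in labels if y == 1)
    n_nonspeech = len(labels) - n_speech
    if n_speech == 0 or n_nonspeech == 0:
        raise ValueError("need both speech and non-speech frames")
    best_gap, best_eer = None, None
    for thr in sorted(set(scores)):
        false_accepts = sum(1 for s, y in zip(scores, labels)
                            if y == 0 and s >= thr)
        false_rejects = sum(1 for s, y in zip(scores, labels)
                            if y == 1 and s < thr)
        far = false_accepts / n_nonspeech
        frr = false_rejects / n_speech
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2.0
    return best_eer
```

In such a setup, the two score lists would come from separate classifiers run on the enhanced magnitude and phase features, and `alpha` would typically be tuned on held-out data.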

Pages: 845-859

Number of pages: 14

[1] Noise robust voice activity detection using joint phase and magnitude based feature enhancement
Phapatanaburi, Khomdet
Wang, Longbiao
Oo, Zeyan
Li, Weifeng
Nakagawa, Seiichi
Iwahashi, Masahiro
JOURNAL OF AMBIENT INTELLIGENCE AND HUMANIZED COMPUTING, 2017, 8 (06) : 845 - 859