PAN: PHONEME-AWARE NETWORK FOR MONAURAL SPEECH ENHANCEMENT

Cited by: 0

Authors:

Du, Zhihao [1]

Lei, Ming [2]

Han, Jiqing [1]

Zhang, Shiliang [2]

Affiliations:

[1] Harbin Institute of Technology, School of Computer Science and Technology, Harbin, People's Republic of China

[2] Alibaba Group, Machine Intelligence Technology, Hangzhou, People's Republic of China

Source:

2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2020

Funding:

National Natural Science Foundation of China;

Keywords:

Monaural speech enhancement; phonetic posteriorgram; phoneme-aware network;

DOI:

10.1109/icassp40776.2020.9054334

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

Current methods for monaural speech enhancement utilize only acoustic information and seldom consider the phonetic information of an utterance. In the voice conversion community, significant progress has been achieved by using phonetic information via phonetic posteriorgrams (PPGs). Inspired by this progress, we propose a phoneme-aware network (PAN) that utilizes noisy PPGs for speech enhancement. Since PPG prediction and speech enhancement benefit from each other, a PPG predictor is incorporated into the PAN, and an iterative training algorithm is proposed for it. Experimental results show that using the phonetic information improves enhancement performance in terms of speech intelligibility, perceptual quality, and character error rate. To the best of our knowledge, this is the first work to introduce PPGs into speech enhancement.


Pages: 6634-6638

Page count: 5

29 items in total

[1] Speaker and Phoneme-Aware Speech Bandwidth Extension with Residual Dual-Path Network
Hou, Nana
Xu, Chenglin
Van Tung Pham
Zhou, Joey Tianyi
Chng, Eng Siong
Li, Haizhou
INTERSPEECH 2020, 2020: 4064-4068
[2] A Recursive Network with Dynamic Attention for Monaural Speech Enhancement
Li, Andong
Zheng, Chengshi
Fan, Cunhang
Peng, Renhua
Li, Xiaodong
INTERSPEECH 2020, 2020: 2422-2426
[3] SpecMNet: Spectrum mend network for monaural speech enhancement
Fan, Cunhang
Zhang, Hongmei
Yi, Jiangyan
Lv, Zhao
Tao, Jianhua
Li, Taihao
Pei, Guanxiong
Wu, Xiaopei
Li, Sheng
APPLIED ACOUSTICS, 2022, 194
[4] Dilated convolutional recurrent neural network for monaural speech enhancement
Pirhosseinloo, Shadi
Brumberg, Jonathan S.
CONFERENCE RECORD OF THE 2019 FIFTY-THIRD ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS & COMPUTERS, 2019: 158-162
[5] Multi-scale informative perceptual network for monaural speech enhancement
Lan, Tian
Li, Jiajia
Feng, Yujia
Tai, Wenxin
Wang, Yixiang
Chen, Cong
Kang, Jun
Liu, Qiao
APPLIED ACOUSTICS, 2022, 195
[6] COMPLEX SPECTRAL MAPPING WITH A CONVOLUTIONAL RECURRENT NETWORK FOR MONAURAL SPEECH ENHANCEMENT
Tan, Ke
Wang, DeLiang
2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019: 6865-6869
[7] Harmonic Attention for Monaural Speech Enhancement
Wang, Tianrui
Zhu, Weibin
Gao, Yingying
Zhang, Shilei
Feng, Junlan
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31: 2424-2436
[8] PFRNet: Dual-Branch Progressive Fusion Rectification Network for Monaural Speech Enhancement
Yu, Runxiang
Zhao, Ziwei
Ye, Zhongfu
IEEE SIGNAL PROCESSING LETTERS, 2022, 29: 2358-2362
[9] Supervised Attention Multi-Scale Temporal Convolutional Network for monaural speech enhancement
Zhang, Zehua
Zhang, Lu
Zhuang, Xuyi
Qian, Yukun
Wang, Mingjiang
EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2024, 2024 (01)
[10] Spatial information aided speech and noise feature discrimination for Monaural speech enhancement
Xu, Xinmeng
Li, Jizhen
Zhang, Yiqun
Tu, Weiping
Yang, Yuhong
EXPERT SYSTEMS WITH APPLICATIONS, 2025, 269
