PAN: PHONEME-AWARE NETWORK FOR MONAURAL SPEECH ENHANCEMENT

Cited by: 0
Authors
Du, Zhihao [1]
Lei, Ming [2]
Han, Jiqing [1]
Zhang, Shiliang [2]
Affiliations
[1] Harbin Inst Technol, Sch Comp Sci & Technol, Harbin, Peoples R China
[2] Alibaba Grp, Machine Intelligence Technol, Hangzhou, Peoples R China
Source
2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2020
Funding
National Natural Science Foundation of China
Keywords
Monaural speech enhancement; phonetic posteriorgram; phoneme-aware network
DOI
10.1109/icassp40776.2020.9054334
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Current methods for monaural speech enhancement rely on acoustic information and seldom consider the phonetic information of an utterance. In the voice conversion community, significant progress has been achieved by exploiting phonetic information through phonetic posteriorgrams (PPGs). Inspired by this progress, we propose a phoneme-aware network (PAN) that uses noisy PPGs for speech enhancement. Since PPG prediction and speech enhancement benefit from each other, a PPG predictor is incorporated into the PAN, and an iterative training algorithm is proposed for it. Experimental results show that using phonetic information improves enhancement performance in terms of speech intelligibility, perceptual quality, and character error rate. To the best of our knowledge, this is the first work to introduce PPGs into speech enhancement.
Pages: 6634-6638
Page count: 5