Single-Channel Speech Enhancement With Phase Reconstruction Based on Phase Distortion Averaging

Cited by: 38
Authors
Wakabayashi, Yukoh [1]
Fukumori, Takahiro [2]
Nakayama, Masato [2]
Nishiura, Takanobu [2]
Yamashita, Yoichi [2]
Affiliations
[1] Ritsumeikan Univ, Grad Sch Informat Sci & Engn, Kusatsu 5258577, Japan
[2] Ritsumeikan Univ, Coll Informat Sci & Engn, Kusatsu 5258577, Japan
Funding
Japan Society for the Promotion of Science
Keywords
Phase reconstruction; speech enhancement; phase distortion; harmonic structure; fundamental frequency; spectral coefficients; amplitude; suppression; noise
DOI
10.1109/TASLP.2018.2831632
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206 ; 082403 ;
Abstract
Speech enhancement has been widely investigated for several decades, but mostly by modifying only the amplitude spectrum of a speech signal while ignoring the phase spectrum, which has been regarded as unimportant. However, it was recently reported that the phase spectrum plays an important role in speech quality and intelligibility. In this paper, we propose a phase reconstruction method based on harmonic enhancement using the fundamental frequency and the phase distortion feature. This feature is known to capture fluctuations in the phase spectrum with respect to time and frequency. We estimate the speech phase spectrum by considering the relationship between harmonic phase spectra. Experimental evaluations indicate that the proposed phase reconstruction method improves speech quality in various noisy environments.
Pages: 1559-1569
Page count: 11