A Single-Input/Binaural-Output Antiphasic Speech Enhancement Method for Speech Intelligibility Improvement

Cited by: 5

Authors:

Pan, Ningning ^{[1,2]}

Wang, Yuzhu ^{[1,2]}

Chen, Jingdong ^{[1,2]}

Benesty, Jacob ^{[3]}

Affiliations:

[1] Northwestern Polytech Univ, CIAIC, Xian 710072, Peoples R China

[2] Northwestern Polytech Univ, Shaanxi Prov Key Lab Artificial Intelligence, Xian 710072, Peoples R China

[3] Univ Quebec, INRS EMT, Montreal, PQ H5A 1K6, Canada

Source:

IEEE SIGNAL PROCESSING LETTERS | 2021 / Vol. 28

Funding:

U.S. National Science Foundation;

Keywords:

Convolution; Rendering (computer graphics); Ear; Speech enhancement; Training; Noise measurement; Decoding; Antiphasic rendering; binaural; speech enhancement; deep learning; modified rhyme test; intelligibility; CHANNEL NOISE-REDUCTION; HEARING;

DOI:

10.1109/LSP.2021.3095016

Chinese Library Classification (CLC) codes:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Improving intelligibility of a speech signal of interest from its observations (with a single microphone) corrupted by additive noise has long been a challenging problem. Motivated by important findings achieved in the psychoacoustic field, we propose in this work a deep learning based method to render the noise and desired speech in the perceptual space such that the perception of the desired speech is least affected by the noise. Specifically, we adopt the temporal convolutional network (TCN) based structure to map the single-channel noisy observations into two binaural signals, one for the left ear and the other for the right ear. The TCN is trained in such a way that the desired speech and noise will be perceived to be in opposite directions when the listener listens to the binaural signals. This antiphasic binaural presentation enables the listener to better distinguish the desired speech from the annoying noise for improved speech intelligibility. The modified rhyme test is performed for evaluation and the results justify the superiority of the proposed method for speech intelligibility improvement.
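The idea of an antiphasic binaural presentation can be illustrated with a toy sketch. It assumes oracle access to the clean speech and the noise, which the method in the abstract does not have: there, a TCN estimates the two ear signals directly from the single noisy observation. The configuration shown (speech in phase at both ears, noise phase-inverted between ears) is one classical arrangement from the psychoacoustics literature; the abstract does not state that the paper uses exactly this target. The function names, sampling rate, and test tones are hypothetical.

```python
import math


def antiphasic_render(speech, noise):
    """Toy rendering: speech identical at both ears, noise sign-flipped at the right ear.

    Assumes oracle speech/noise components; this is an illustration only,
    not the TCN-based mapping described in the abstract.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    left = [s + v for s, v in zip(speech, noise)]
    right = [s - v for s, v in zip(speech, noise)]
    return left, right


def interaural_correlation(a, b):
    """Zero-lag normalized correlation between two equal-length signals."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den else 0.0


# Hypothetical test signals: a 200 Hz tone stands in for speech,
# a weaker 1370 Hz tone stands in for noise.
fs = 8000
n = 800
speech = [math.sin(2 * math.pi * 200 * t / fs) for t in range(n)]
noise = [0.5 * math.sin(2 * math.pi * 1370 * t / fs + 0.3) for t in range(n)]

left, right = antiphasic_render(speech, noise)

# Interaural correlation of each component as presented to the two ears.
speech_corr = interaural_correlation(speech, speech)
noise_corr = interaural_correlation(noise, [-v for v in noise])
```

In this sketch the speech component is fully correlated between the ears while the noise component is fully anti-correlated, which is the interaural difference a listener's binaural hearing can exploit to separate the two.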

Pages: 1445-1449

Number of pages: 5
