Multi-Channel to Multi-Channel Noise Reduction and Reverberant Speech Preservation in Time-Varying Acoustic Scenes for Binaural Reproduction

Cited by: 0
Authors
Lugasi, Moti [1]
Donley, Jacob [2]
Menon, Anjali [2]
Tourbabin, Vladimir [2]
Rafaely, Boaz [1]
Affiliations
[1] Ben Gurion Univ Negev, Sch Elect, Comp Engn, IL-84105 Beer Sheva, Israel
[2] Meta, Real Labs Res, Menlo Pk, CA 94025 USA
Keywords
Covariance matrices; Noise; Microphone arrays; Indexes; Spatial audio; Acoustic distortion; Transfer functions; Array processing; binaural reproduction; covariance matrix estimation; moving source; multi-channel Wiener filter; noise reduction; spatial audio enhancement; ENHANCEMENT; ALGORITHMS; APPROXIMATION; SOUND;
DOI
10.1109/TASLP.2024.3416668
Chinese Library Classification number
O42 [Acoustics];
Subject classification codes
070206; 082403;
Abstract
Real-life acoustic scenes may be recorded with microphone arrays for spatial audio applications, especially for the purpose of reproducing binaural signals for headphone listening. However, the presence of noise and interference may necessitate preprocessing to enhance the desired signal and improve the listener experience. Various methods have been developed to reduce noise while preserving the desired signal component with minimal distortion. The additional challenges posed by time-varying acoustic scenes are commonly addressed by segmenting the recorded signals into short time frames. Then, the short-time Fourier transform (STFT) is employed with a multi-channel Wiener filter (MWF), assuming the multiplicative transfer function (MTF) approximation. This approximation may not apply in the presence of long reverberation times and/or short STFT frames, so alternative techniques are required. This paper explores MWF-based enhancement in time-varying acoustic scenes where the MTF approximation is inapplicable, both analytically and experimentally with normal-hearing listeners. The investigated scene comprises a single desired source in a reverberant environment, and the impact of frame length and acoustic parameters on the rank of the spatial covariance matrix is studied. It is revealed that superior results in terms of reduced distortion and improved listener experience are achieved when using a full-rank spatial covariance matrix.
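The role of the covariance rank mentioned in the abstract can be illustrated with a toy example. The sketch below is not the paper's implementation: it assumes the textbook multi-channel-to-multi-channel MWF, W = (Phi_s + Phi_v)^(-1) Phi_s, applied as W^H x in a single STFT bin with two microphones. The transfer vector h, the noise level 0.1, and the reverberant term 0.3 are hypothetical values chosen only for illustration, and the code is kept dependency-free on purpose.

```python
# Toy sketch: 2 microphones, one STFT bin, all numbers hypothetical.
# Compares the MWF built from a rank-1 desired-signal covariance (the MTF
# model for a single source) with one built from a full-rank covariance.

def add(a, b):
    """Element-wise sum of two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    d = det(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

def outer(h):
    """Rank-1 covariance h h^H for a unit-variance source."""
    return [[h[i] * h[j].conjugate() for j in range(2)] for i in range(2)]

def mwf(phi_s, phi_v):
    """Multi-channel-to-multi-channel Wiener filter (Phi_s + Phi_v)^-1 Phi_s."""
    return matmul(inv(add(phi_s, phi_v)), phi_s)

h = [1 + 0j, 0.5 + 0.5j]                  # assumed acoustic transfer vector
phi_v = [[0.1, 0.0], [0.0, 0.1]]          # assumed spatially white noise
phi_s_rank1 = outer(h)                    # MTF model: rank-1 covariance
phi_s_full = add(phi_s_rank1,             # plus an assumed diffuse
                 [[0.3, 0.0], [0.0, 0.3]])  # reverberant component

w_rank1 = mwf(phi_s_rank1, phi_v)
w_full = mwf(phi_s_full, phi_v)

# A (near-)zero determinant means the filter is singular: its output is
# confined to a single spatial direction.
print("|det W|, rank-1 model:   ", abs(det(w_rank1)))
print("|det W|, full-rank model:", abs(det(w_full)))
```

With the rank-1 model the filter itself is singular, so both output channels are scaled copies of one signal and any spatial information outside the direction of h is discarded. With the full-rank model the filter remains invertible and can pass reverberant energy arriving from other directions, which is the kind of spatial detail binaural reproduction relies on.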
Pages: 3283-3295
Number of pages: 13