Multi-channel neural audio decorrelation using generative adversarial networks

Times cited: 0
Authors
Anemueller, Carlotta [1]
Thiergart, Oliver [1]
Habets, Emanuel A. P. [1]
Affiliations
[1] Int Audio Labs Erlangen, Wolfsmantel 33, D-91058 Erlangen, Germany
Source
EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING | 2024 / Vol. 2024 / Issue 01
Keywords
Multi-channel audio decorrelation; Generative adversarial networks; Convolutional neural networks; REPRODUCTION; SIGNALS;
DOI
10.1186/s13636-024-00378-y
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
The degree of correlation between the sounds received by the ears significantly influences the spatial perception of a sound image. Audio signal decorrelation is, therefore, a commonly used tool in various spatial audio rendering applications. In this paper, we propose a multi-channel extension of a previously proposed decorrelation method based on generative adversarial networks. A separate generator network is employed for each output channel. All generator networks are optimized jointly to obtain a multi-channel output signal with the desired properties. The training objective includes a number of individual loss terms to control both the input-output and the inter-channel correlation as well as the quality of the individual output channels. The proposed approach is trained on music signals and evaluated both objectively and through formal listening tests, in which it is compared with two classical signal-processing-based multi-channel decorrelators. Additionally, the influence of the number of output channels, the individual loss term weightings, and the employed training data on the proposed method's performance is investigated.
Pages: 14