ADVERSARIAL SEMI-SUPERVISED AUDIO SOURCE SEPARATION APPLIED TO SINGING VOICE EXTRACTION

Cited by: 0
Authors
Stoller, Daniel [1]
Ewert, Sebastian [2]
Dixon, Simon [1]
Affiliations
[1] Queen Mary Univ London, London, England
[2] Spotify, Luxembourg, Luxembourg
Source
2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2018
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
Source separation; Deep neural networks; Adversarial training; Semi-supervised learning;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
The state of the art in music source separation employs neural networks trained in a supervised fashion on multi-track databases to estimate the sources from a given mixture. With only a few datasets available, extensive data augmentation is often used to combat overfitting. Mixing random tracks, however, can even reduce separation performance, as instruments in real music are strongly correlated. The key concept in our approach is that source estimates of an optimal separator should be indistinguishable from real source signals. Based on this idea, we drive the separator towards outputs deemed realistic by discriminator networks that are trained to tell real source samples apart from separator outputs. This way, we can also use unpaired source and mixture recordings without the drawbacks of creating unrealistic music mixtures. Our framework is widely applicable as it does not assume a specific network architecture or number of sources. To our knowledge, this is the first adoption of adversarial training for music source separation. In a prototype experiment for singing voice separation, separation performance increases with our approach compared to purely supervised training.
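The idea in the abstract, a separator pushed towards outputs that a discriminator judges realistic while still learning from paired data, might be sketched as the two loss functions below. This is only an illustration under stated assumptions: the record does not give the paper's actual objective, so a standard cross-entropy GAN loss is swapped in here, and the function names and the weighting factor `alpha` are hypothetical.

```python
import numpy as np

def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return np.logaddexp(0.0, x)

def discriminator_loss(real_logits, fake_logits):
    """Cross-entropy loss for a discriminator (assumed, not the paper's stated loss).

    real_logits: discriminator scores on real source recordings (label 1).
    fake_logits: discriminator scores on separator outputs (label 0).
    """
    real_term = softplus(-real_logits).mean()  # -log(sigmoid(real))
    fake_term = softplus(fake_logits).mean()   # -log(1 - sigmoid(fake))
    return real_term + fake_term

def separator_loss(est_paired, target_paired, fake_logits, alpha=0.1):
    """Supervised error on paired data plus an adversarial term.

    est_paired / target_paired: separator estimates and ground-truth sources
        for mixtures that have multi-track annotations.
    fake_logits: discriminator scores on separator outputs, which may come
        from unpaired mixtures with no ground truth.
    alpha: hypothetical weight balancing the two terms.
    """
    supervised = np.mean((est_paired - target_paired) ** 2)
    adversarial = softplus(-fake_logits).mean()  # low when the discriminator is fooled
    return supervised + alpha * adversarial
```

In this sketch the unpaired mixtures contribute only through `fake_logits`, which is how unlabeled recordings can influence training without synthetic random mixes. An actual implementation would compute both losses inside an automatic-differentiation framework and alternate discriminator and separator updates.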
Pages: 2391-2395
Number of pages: 5
Related papers
50 in total
  • [1] SEMI-SUPERVISED SINGING VOICE SEPARATION WITH NOISY SELF-TRAINING
    Wang, Zhepei
    Giri, Ritwik
    Isik, Umut
    Valin, Jean-Marc
    Krishnaswamy, Arvindh
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021: 31-35
  • [2] Semi-supervised Audio Source Separation based on the Iterative Estimation and Extraction of Note Events
    Castro, Alejandro Delgado
    Szymanski, John E.
    PROCEEDINGS OF THE 16TH INTERNATIONAL JOINT CONFERENCE ON E-BUSINESS AND TELECOMMUNICATIONS, VOL 1: DCNET, ICE-B, OPTICS, SIGMAP AND WINSYS (ICETE), 2019: 273-279
  • [3] SEMI-SUPERVISED MONAURAL SINGING VOICE SEPARATION WITH A MASKING NETWORK TRAINED ON SYNTHETIC MIXTURES
    Michelashvili, Michael
    Benaim, Sagie
    Wolf, Lior
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019: 291-295
  • [4] DISCRIMINATIVE AND RECONSTRUCTIVE BASIS TRAINING FOR AUDIO SOURCE SEPARATION WITH SEMI-SUPERVISED NONNEGATIVE MATRIX FACTORIZATION
    Kitamura, Daichi
    Ono, Nobutaka
    Saruwatari, Hiroshi
    Takahashi, Yu
    Kondo, Kazunobu
    2016 IEEE INTERNATIONAL WORKSHOP ON ACOUSTIC SIGNAL ENHANCEMENT (IWAENC), 2016
  • [5] Singing Voice Detection via Similarity-based Semi-supervised Learning
    Chen, Xi
    Gao, Yongwei
    Li, Wei
    PROCEEDINGS OF THE 4TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA IN ASIA, MMASIA 2022, 2022
  • [6] Adversarial Dropout for Supervised and Semi-Supervised Learning
    Park, Sungrae
    Park, JunKeon
    Shin, Su-Jin
    Moon, Il-Chul
    THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018: 3917-3924
  • [7] INTERACTIVE REFINEMENT OF SUPERVISED AND SEMI-SUPERVISED SOUND SOURCE SEPARATION ESTIMATES
    Bryan, Nicholas J.
    Mysore, Gautham J.
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013: 883-887
  • [8] Adversarial Multi-Teacher Distillation for Semi-Supervised Relation Extraction
    Li, Wanli
    Qian, Tieyun
    Li, Xuhui
    Zou, Lixin
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35(08): 11291-11301
  • [9] Adversarial Transformations for Semi-Supervised Learning
    Suzuki, Teppei
    Sato, Ikuro
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34: 5916-5923
  • [10] Semi-Supervised Adversarial Variational Autoencoder
    Zemouri, Ryad
    MACHINE LEARNING AND KNOWLEDGE EXTRACTION, 2020, 2(03): 361-378