RPCA-DRNN technique for monaural singing voice separation

Cited by: 8
Authors
Lai, Wen-Hsing [1]
Wang, Siou-Lin [2]
Affiliations
[1] Natl Kaohsiung Univ Sci & Technol, Dept Comp & Commun Engn, Kaohsiung 824005, Taiwan
[2] Natl Kaohsiung Univ Sci & Technol, Coll Engn, PhD Program Engn Sci & Technol, Kaohsiung 824005, Taiwan
Keywords
Singing separation; Robust principal component analysis; Deep recurrent neural network; Stacked recurrent neural network; FACTORIZATION; OPTIMIZATION; ENHANCEMENT; CONTINUITY; MODELS;
DOI
10.1186/s13636-022-00236-9
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
In this study, we propose a method for separating a singing voice from the musical accompaniment in a monaural music mixture. The method first decomposes the mixture using robust principal component analysis (RPCA) followed by postprocessing with a median filter, morphological operations, and a high-pass filter. A deep recurrent neural network (DRNN), comprising two jointly optimized parallel stacked recurrent neural networks (sRNNs) with mask layers and trained with limited data and computation, is then applied to the decomposed components. This stage refines the estimated singing voice and background music by correcting singing and background music that were misclassified or left as residue in the initial separation. Experimental results on the MIR-1K, ccMixter, and MUSDB18 datasets, together with a comparison against ten existing techniques, indicate that the proposed method achieves competitive performance in monaural singing voice separation. On MUSDB18, the proposed method reaches comparable separation quality with less training data and lower computational cost than the other state-of-the-art technique.
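The RPCA decomposition step named in the abstract can be illustrated with a minimal sketch. The abstract does not say which solver or parameters the authors use, so the code below assumes a standard inexact augmented Lagrangian solver for principal component pursuit, the common default weight 1/sqrt(max(m, n)), and a synthetic matrix standing in for a magnitude spectrogram. Treating the low-rank part as accompaniment and the sparse part as voice, and the simple binary mask, are also assumptions drawn from common RPCA practice for this task rather than from the abstract. The function names `soft_threshold` and `rpca_ialm` are hypothetical. The postprocessing and DRNN stages are not shown.

```python
import numpy as np

def soft_threshold(x, tau):
    """Elementwise shrinkage: move each value toward zero by tau."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def rpca_ialm(M, lam=None, tol=1e-7, max_iter=500):
    """Split M into low-rank L plus sparse S (inexact ALM, assumed solver)."""
    M = np.asarray(M, dtype=float)
    m, n = M.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))   # common default weight (assumption)
    norm_M = np.linalg.norm(M, "fro")
    spectral = np.linalg.norm(M, 2)      # largest singular value of M
    Y = M / max(spectral, np.abs(M).max() / lam)  # dual variable start
    mu = 1.25 / spectral
    mu_max = mu * 1e7
    rho = 1.5                            # penalty growth factor
    S = np.zeros_like(M)
    L = np.zeros_like(M)
    for _ in range(max_iter):
        # Low-rank update: shrink the singular values.
        U, sig, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = (U * soft_threshold(sig, 1.0 / mu)) @ Vt
        # Sparse update: shrink the entries.
        S = soft_threshold(M - L + Y / mu, lam / mu)
        R = M - L - S                    # constraint residual
        Y = Y + mu * R
        mu = min(mu * rho, mu_max)
        if np.linalg.norm(R, "fro") <= tol * norm_M:
            break
    return L, S

# Synthetic stand-in for a magnitude spectrogram (frequency bins x frames).
rng = np.random.default_rng(0)
low_rank = np.abs(rng.normal(size=(40, 2))) @ np.abs(rng.normal(size=(2, 60)))
sparse = np.zeros((40, 60))
idx = rng.random((40, 60)) < 0.05
sparse[idx] = 5.0 * rng.random(idx.sum())
M = low_rank + sparse

L, S = rpca_ialm(M)
# Assumed mapping: sparse part ~ voice, low-rank part ~ accompaniment.
voice_mask = np.abs(S) > np.abs(L)
voice_estimate = voice_mask * M
music_estimate = (~voice_mask) * M
```

In a real pipeline the input would be the magnitude of a short-time Fourier transform, and the masked magnitudes would be recombined with the mixture phase before inversion; those steps are omitted here.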
Pages: 21