RPCA-DRNN technique for monaural singing voice separation

Cited by: 8
Authors
Lai, Wen-Hsing [1]
Wang, Siou-Lin [2]
Affiliations
[1] Natl Kaohsiung Univ Sci & Technol, Dept Comp & Commun Engn, Kaohsiung 824005, Taiwan
[2] Natl Kaohsiung Univ Sci & Technol, Coll Engn, PhD Program Engn Sci & Technol, Kaohsiung 824005, Taiwan
Keywords
Singing separation; Robust principal component analysis; Deep recurrent neural network; Stacked recurrent neural network; FACTORIZATION; OPTIMIZATION; ENHANCEMENT; CONTINUITY; MODELS
DOI
10.1186/s13636-022-00236-9
Chinese Library Classification number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
In this study, we propose a method for separating a singing voice from the musical accompaniment in a monaural musical mixture. The method first decomposes the mixture using robust principal component analysis (RPCA), followed by postprocessing that includes a median filter, morphology, and a high-pass filter. A deep recurrent neural network, comprising two jointly optimized parallel-stacked recurrent neural networks (sRNNs) with mask layers and trained with limited data and computation, is then applied to the decomposed components. This network refines the final estimates of the separated singing voice and background music by correcting misclassified or residual singing and background music left by the initial separation. Experimental results on the MIR-1K, ccMixter, and MUSDB18 datasets, together with a comparison against ten existing techniques, indicate that the proposed method achieves competitive performance in monaural singing voice separation. On MUSDB18, the proposed method reaches comparable separation quality with less training data and lower computational cost than the state-of-the-art technique it is compared with.
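The abstract mentions mask layers at the output of the two networks but does not give their form. As a rough illustration only, the sketch below assumes the common soft time-frequency mask, in which each source receives the share of the mixture magnitude given by its estimate divided by the sum of both estimates. The function name `soft_mask_layer`, the `eps` guard, and the toy numbers are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# Magnitude spectrogram: rows are frequency bins, columns are time frames.
Spectrogram = List[List[float]]


def soft_mask_layer(
    voice_est: Spectrogram,
    music_est: Spectrogram,
    mixture: Spectrogram,
    eps: float = 1e-12,
) -> Tuple[Spectrogram, Spectrogram]:
    """Split the mixture magnitudes between voice and music with a soft mask.

    Assumption: mask = |voice| / (|voice| + |music|). The abstract does not
    state the exact mask used, so this is only one plausible choice.
    """
    voice_out: Spectrogram = []
    music_out: Spectrogram = []
    for v_row, m_row, x_row in zip(voice_est, music_est, mixture):
        v_line, m_line = [], []
        for v, m, x in zip(v_row, m_row, x_row):
            # eps avoids division by zero when both estimates are zero;
            # in that case the whole bin is assigned to the music output.
            mask = abs(v) / (abs(v) + abs(m) + eps)
            v_line.append(mask * x)
            m_line.append((1.0 - mask) * x)
        voice_out.append(v_line)
        music_out.append(m_line)
    return voice_out, music_out


# Toy 2x2 magnitude "spectrograms" (made-up numbers, for illustration only).
mixture = [[1.0, 2.0], [4.0, 0.5]]
voice_est = [[0.9, 0.5], [1.0, 0.0]]
music_est = [[0.1, 1.5], [3.0, 0.5]]

voice, music = soft_mask_layer(voice_est, music_est, mixture)
```

With a mask of this form, the two outputs add up to the mixture magnitude in every time-frequency bin, so no energy is created or lost by the masking step itself.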
Pages: 21
Related papers
50 in total
  • [31] Singing Voice Enhancement in Monaural Music Signals Based on Two-stage Harmonic/Percussive Sound Separation on Multiple Resolution Spectrograms
    Tachibana, Hideyuki
    Ono, Nobutaka
    Sagayama, Shigeki
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2014, 22 (01) : 228 - 237
  • [32] Blind monaural singing voice separation using rank-1 constraint robust principal component analysis and vocal activity detection
    Li, Feng
    Akagi, Masato
    NEUROCOMPUTING, 2019, 350 : 44 - 52
  • [34] Singing voice enhancement in monaural music signals based on two-stage harmonic/percussive sound separation on multiple resolution spectrograms
    Tachibana, Hideyuki
    Ono, Nobutaka
    Sagayama, Shigeki
    IEEE Transactions on Audio, Speech and Language Processing, 2014, 22 (01) : 228 - 237
  • [35] ON THE PERCEPTUAL RELEVANCE OF OBJECTIVE SOURCE SEPARATION MEASURES FOR SINGING VOICE SEPARATION
    Gupta, Udit
    Moore, Elliot, II
    Lerch, Alexander
    2015 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA), 2015,
  • [36] Unsupervised Interpretable Representation Learning for Singing Voice Separation
    Mimilakis, Stylianos, I
    Drossos, Konstantinos
    Schuller, Gerald
    28TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2020), 2021, : 1412 - 1416
  • [37] A Distinct Synthesizer Convolutional TasNet for Singing Voice Separation
    Tian, Congzhou
    Yang, Deshun
    Chen, Xiaoou
    MULTIMEDIA MODELING (MMM 2020), PT I, 2020, 11961 : 37 - 48
  • [38] Monophonic Singing Voice Separation Based on Deep Learning
    Wang, Yutian
    Zhang, Zhao
    Wang, Zheng
    Cai, JuanJuan
    Wang, Hui
    2019 2ND IEEE CONFERENCE ON MULTIMEDIA INFORMATION PROCESSING AND RETRIEVAL (MIPR 2019), 2019, : 491 - 495
  • [39] Exploiting Music Source Separation For Singing Voice Detection
    Bonzi, Francesco
    Mancusi, Michele
    Deo, Simone Del
    Melucci, Pierfrancesco
    Tavella, Maria Stella
    Parisi, Loreto
    Rodola, Emanuele
    IEEE International Workshop on Machine Learning for Signal Processing, MLSP, 2023, 2023-September
  • [40] Singing Voice Separation in Mono-Channel Music
    Chanrungutai, Angkana
    Ratanamahatana, Chotirat Ann
    2008 INTERNATIONAL SYMPOSIUM ON COMMUNICATIONS AND INFORMATION TECHNOLOGIES, 2008, : 256 - 261