RPCA-DRNN technique for monaural singing voice separation

Cited by: 8

Authors:

Lai, Wen-Hsing [1]

Wang, Siou-Lin [2]

Affiliations:

[1] Natl Kaohsiung Univ Sci & Technol, Dept Comp & Commun Engn, Kaohsiung 824005, Taiwan

[2] Natl Kaohsiung Univ Sci & Technol, Coll Engn, PhD Program Engn Sci & Technol, Kaohsiung 824005, Taiwan

Source:

EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING | Year 2022 / Volume 2022 / Issue 01

Keywords:

Singing separation; Robust principal component analysis; Deep recurrent neural network; Stacked recurrent neural network; FACTORIZATION; OPTIMIZATION; ENHANCEMENT; CONTINUITY; MODELS;

DOI:

10.1186/s13636-022-00236-9

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

In this study, we propose a method for separating a singing voice from the musical accompaniment in a monaural mixture. The method first decomposes the mixture using robust principal component analysis (RPCA) followed by postprocessing with a median filter, morphological operations, and a high-pass filter. A deep recurrent neural network (DRNN), comprising two jointly optimized parallel stacked recurrent neural networks (sRNNs) with mask layers and trained with limited data and computation, is then applied to the decomposed components. It refines the estimated singing voice and background music by correcting singing and background music that was misclassified or left as residual in the initial separation. Experimental results on the MIR-1K, ccMixter, and MUSDB18 datasets, together with a comparison against ten existing techniques, indicate that the proposed method achieves competitive performance in monaural singing voice separation. On MUSDB18, the proposed method reaches separation quality comparable to that of the other state-of-the-art technique while using less training data and lower computational cost.
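The RPCA stage named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the common formulation in which a magnitude spectrogram M is split into a low-rank part L (typically treated as the repetitive accompaniment) and a sparse part S (typically treated as the voice), solved here with a generic inexact augmented Lagrange multiplier scheme. The function names `shrink`, `rpca_ialm`, and `soft_masks`, the default parameters, and the soft-mask formula are all illustrative assumptions. The postprocessing filters and the DRNN stage are not shown.

```python
import numpy as np


def shrink(x, tau):
    """Soft-thresholding: move every entry toward zero by tau."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def rpca_ialm(M, lam=None, tol=1e-7, max_iter=500):
    """Split M into low-rank L and sparse S with M ~= L + S.

    Minimizes ||L||_* + lam * ||S||_1 subject to M = L + S.
    Assumes M is a nonzero 2-D real array (e.g. a magnitude spectrogram).
    """
    M = np.asarray(M, dtype=float)
    if lam is None:
        lam = 1.0 / np.sqrt(max(M.shape))  # widely used default weight
    norm_fro = np.linalg.norm(M, "fro")
    norm_two = np.linalg.norm(M, 2)
    # Dual variable and penalty schedule (assumed, not from the paper).
    Y = M / max(norm_two, np.abs(M).max() / lam)
    mu, mu_max, rho = 1.25 / norm_two, 1.25e7 / norm_two, 1.5
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    for _ in range(max_iter):
        # Low-rank update: singular value thresholding.
        U, sig, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = (U * shrink(sig, 1.0 / mu)) @ Vt
        # Sparse update: entrywise soft-thresholding.
        S = shrink(M - L + Y / mu, lam / mu)
        # Dual ascent on the constraint residual, then raise the penalty.
        Z = M - L - S
        Y = Y + mu * Z
        mu = min(mu * rho, mu_max)
        if np.linalg.norm(Z, "fro") <= tol * norm_fro:
            break
    return L, S


def soft_masks(L, S, eps=1e-12):
    """Illustrative soft time-frequency masks for the sparse and low-rank parts."""
    voice = np.abs(S) / (np.abs(S) + np.abs(L) + eps)
    return voice, 1.0 - voice
```

In a separation pipeline of this kind, the masks would be multiplied with the mixture spectrogram before inverting back to the time domain; how the paper combines them with its postprocessing and network stages is described in the full text, not here.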

Pages: 21