Unsupervised Single-Channel Singing Voice Separation with Weighted Robust Principal Component Analysis Based on Gammatone Auditory Filterbank and Vocal Activity Detection

Cited by: 0
Authors
Li, Feng [1, 2]
Hu, Yujun [1 ]
Wang, Lingling [1 ]
Affiliations
[1] Anhui Univ Finance & Econ, Dept Comp Sci & Technol, Bengbu 233030, Peoples R China
[2] Univ Sci & Technol China, Sch Informat Sci & Technol, Hefei 230026, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
single channel; singing voice; source separation; robust principal component analysis; gammatone filterbank; vocal activity detection; POLYPHONIC MUSIC; LYRICS ALIGNMENT; SPEECH; NETWORK;
DOI
10.3390/s23063015
Chinese Library Classification
O65 [Analytical Chemistry]
Discipline classification code
070302; 081704
Abstract
Singing-voice separation is the task of separating a singing voice from its musical accompaniment. In this paper, we propose a novel unsupervised method for extracting a singing voice from the background of a musical mixture. The method modifies robust principal component analysis (RPCA) by applying weighting based on a gammatone filterbank and vocal activity detection. Although RPCA is a useful method for separating voices from a music mixture, it fails when one singular value, such as that of drums, is much larger than the others (e.g., those of the accompanying instruments). The proposed approach therefore exploits the differing values between the low-rank (background) and sparse (singing voice) matrices. Additionally, we propose an expanded RPCA on the cochleagram that uses coalescent masking on the gammatone. Finally, we use vocal activity detection to improve the separation results by eliminating the residual music signal. Evaluation results show that the proposed approach yields better separation results than RPCA on the ccMixter and DSD100 datasets.
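The decomposition the abstract refers to can be illustrated with a minimal sketch. This is not the authors' implementation: it is a generic RPCA solver (inexact augmented Lagrange multiplier iterations) that splits a matrix M into a low-rank part L and a sparse part S with M = L + S, plus an optional per-singular-value weight vector as one plausible way to treat a dominant component differently. The function names, the `weights` parameter, and all default values are assumptions made for illustration; the gammatone cochleagram, the masking step, and vocal activity detection are not shown.

```python
import numpy as np


def soft_threshold(x, tau):
    """Elementwise shrinkage (proximal operator of the l1 norm)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def weighted_svt(x, tau, weights=None):
    """Shrink singular values by tau * weights (uniform weights if None).

    `weights` is a hypothetical knob: a smaller weight shrinks that
    singular value less. It must have length min(x.shape).
    """
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    if weights is None:
        weights = np.ones_like(s)
    s_shrunk = np.maximum(s - tau * np.asarray(weights, dtype=float), 0.0)
    return (u * s_shrunk) @ vt


def rpca(m, lam=None, weights=None, n_iter=500, tol=1e-7):
    """Split m into (low_rank, sparse) with m ~= low_rank + sparse."""
    m = np.asarray(m, dtype=float)
    if lam is None:
        # Common default trade-off between the two terms.
        lam = 1.0 / np.sqrt(max(m.shape))
    norm_m = np.linalg.norm(m)        # Frobenius norm
    spectral = np.linalg.norm(m, 2)   # largest singular value
    y = m / max(spectral, np.abs(m).max() / lam)  # dual variable
    mu, rho = 1.25 / spectral, 1.5    # penalty and its growth factor
    mu_max = mu * 1e7
    low_rank = np.zeros_like(m)
    sparse = np.zeros_like(m)
    for _ in range(n_iter):
        low_rank = weighted_svt(m - sparse + y / mu, 1.0 / mu, weights)
        sparse = soft_threshold(m - low_rank + y / mu, lam / mu)
        residual = m - low_rank - sparse
        y = y + mu * residual
        mu = min(mu * rho, mu_max)
        if np.linalg.norm(residual) <= tol * norm_m:
            break
    return low_rank, sparse
```

In a separation setting, `m` would typically be a non-negative time-frequency magnitude matrix, with `low_rank` read as the repetitive accompaniment and `sparse` as the voice. How the weights are actually chosen in the paper is not stated in the abstract.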
Pages: 17
Related papers
21 in total
  • [1] Weighted Robust Principal Component Analysis with Gammatone Auditory Filterbank for Singing Voice Separation
    Li, Feng
    Akagi, Masato
    NEURAL INFORMATION PROCESSING (ICONIP 2017), PT VI, 2017, 10639 : 849 - 858
  • [2] Unsupervised Singing Voice Separation Using Gammatone Auditory Filterbank and Constraint Robust Principal Component Analysis
    Li, Feng
    Akagi, Masato
    2018 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2018, : 1924 - 1928
  • [3] Blind monaural singing voice separation using rank-1 constraint robust principal component analysis and vocal activity detection
    Li, Feng
    Akagi, Masato
    NEUROCOMPUTING, 2019, 350 : 44 - 52
  • [4] Unsupervised Singing Voice Separation Based on Robust Principal Component Analysis Exploiting Rank-1 Constraint
    Li, Feng
    Akagi, Masato
    2018 26TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2018, : 1920 - 1924
  • [5] New method based on single-channel separation algorithm using Gammatone filterbank for cochlear implants
    Essaid, Billel
    Batel, Noureddine
    PROCEEDINGS OF THE 2018 INTERNATIONAL CONFERENCE ON APPLIED SMART SYSTEMS (ICASS), 2018,
  • [6] Singing Voice Separation and Vocal F0 Estimation Based on Mutual Combination of Robust Principal Component Analysis and Subharmonic Summation
    Ikemiya, Yukara
    Itoyama, Katsutoshi
    Yoshii, Kazuyoshi
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2016, 24 (11) : 2084 - 2095
  • [7] Music/Singing Voice Separation Based on Repeating Pattern Extraction Technique and Robust Principal Component Analysis
    Dogan, Sait Melih
    Salor, Ozgul
    2018 5TH INTERNATIONAL CONFERENCE ON ELECTRICAL AND ELECTRONIC ENGINEERING (ICEEE), 2018, : 482 - 487
  • [8] Unsupervised Singing Voice Separation from Music Accompaniment Using Robust Principal Componenet Analysis
    Umap, Priyanka. K.
    Chaudhari, Kirti. B.
    Joshi, Madhuri A.
    2015 INTERNATIONAL CONFERENCE ON INDUSTRIAL INSTRUMENTATION AND CONTROL (ICIC), 2015, : 1433 - 1436
  • [9] Unsupervised Single-Channel Separation of Nonstationary Signals Using Gammatone Filterbank and Itakura-Saito Nonnegative Matrix Two-Dimensional Factorizations
    Gao, Bin
    Woo, W. L.
    Dlay, S. S.
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2013, 60 (03) : 662 - 675
  • [10] SINGING-VOICE SEPARATION FROM MONAURAL RECORDINGS USING ROBUST PRINCIPAL COMPONENT ANALYSIS
    Huang, Po-Sen
    Chen, Scott Deeann
    Smaragdis, Paris
    Hasegawa-Johnson, Mark
    2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 57 - 60