Supervised Audio Source Separation Based on Nonnegative Matrix Factorization with Cosine Similarity Penalty

Cited by: 0
Authors
Iwase, Yuta [1]
Kitamura, Daichi [1]
Affiliations
[1] Kagawa Coll, Natl Inst Technol, Takamatsu, Kagawa 7618058, Japan
Keywords
audio source separation; nonnegative matrix factorization; orthogonality; cosine similarity; divergence
DOI
10.1587/transfun.2021EAP1149
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In this study, we aim to improve the performance of audio source separation for monaural mixture signals. For monaural audio source separation, semisupervised nonnegative matrix factorization (SNMF) can achieve higher separation performance by employing small supervised signals. In particular, penalized SNMF (PSNMF) with orthogonality penalty is an effective method. PSNMF forces two basis matrices for target and nontarget sources to be orthogonal to each other and improves the separation accuracy. However, the conventional orthogonality penalty is based on an inner product and does not affect the estimation of the basis matrix properly because of the scale indeterminacy between the basis and activation matrices in NMF. To cope with this problem, a new PSNMF with cosine similarity between the basis matrices is proposed. The experimental comparison shows the efficacy of the proposed cosine similarity penalty in supervised audio source separation.
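The scale indeterminacy mentioned in the abstract can be illustrated with a small numerical sketch. The penalty definitions below (sums over all pairs of basis vectors) are assumptions chosen for illustration and are not necessarily the exact forms used in the paper. The names `F`, `H`, `G`, and all helper functions are hypothetical. The sketch shows that shrinking a basis matrix while growing its activation matrix leaves the NMF model unchanged, yet reduces an inner-product penalty, whereas a cosine similarity penalty stays the same.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def inner_product_penalty(F, H):
    """Assumed form: sum of inner products over all basis-vector pairs."""
    return sum(dot(f, h) for f in F for h in H)

def cosine_penalty(F, H):
    """Assumed form: sum of cosine similarities over all basis-vector pairs."""
    return sum(
        dot(f, h) / (math.sqrt(dot(f, f)) * math.sqrt(dot(h, h)))
        for f in F for h in H
    )

def reconstruct(B, A):
    """Model spectrogram: X[i][j] = sum_k B[k][i] * A[k][j].

    B holds K basis vectors (each of length I); A holds K activation rows
    (each of length J).
    """
    n_freq, n_time = len(B[0]), len(A[0])
    return [
        [sum(B[k][i] * A[k][j] for k in range(len(B))) for j in range(n_time)]
        for i in range(n_freq)
    ]

# Toy nonnegative data: each inner list of F and H is one basis vector.
F = [[1.0, 0.5, 0.0], [0.0, 1.0, 0.2]]   # supervised (target) bases
H = [[0.2, 0.1, 1.0], [0.5, 0.5, 0.5]]   # nontarget bases
G = [[1.0, 2.0], [3.0, 0.5]]             # activations paired with H

# Shrink the bases by c and grow the activations by 1/c.
c = 2.0 ** -10
H_scaled = [[c * x for x in h] for h in H]
G_scaled = [[x / c for x in g] for g in G]

# The modeled spectrogram H G is identical before and after rescaling.
same_model = all(
    math.isclose(a, b)
    for row_a, row_b in zip(reconstruct(H, G), reconstruct(H_scaled, G_scaled))
    for a, b in zip(row_a, row_b)
)

print("model unchanged:       ", same_model)
print("inner-product penalty: ", inner_product_penalty(F, H),
      "->", inner_product_penalty(F, H_scaled))
print("cosine penalty:        ", cosine_penalty(F, H),
      "->", cosine_penalty(F, H_scaled))
```

Under these assumptions, an optimizer could lower the inner-product penalty simply by rescaling, without making the two sets of bases any less similar in shape. The cosine form removes that shortcut because it depends only on the directions of the basis vectors.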
Pages: 906-913
Number of pages: 8