Denoising Auto-encoder with Recurrent Skip Connections and Residual Regression for Music Source Separation

Cited by: 27
Authors
Liu, Jen-Yu [1 ]
Yang, Yi-Hsuan [1 ]
Affiliations
[1] Academia Sinica, Research Center for Information Technology Innovation, Taipei, Taiwan
Source
2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA), 2018
Keywords
Music source separation; recurrent neural network; skip connections; residual regression
DOI
10.1109/ICMLA.2018.00123
CLC classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Convolutional neural networks with skip connections have shown good performance in music source separation. In this work, we propose a denoising Auto-encoder with Recurrent skip Connections (ARC). We use 1D convolution along the temporal axis of the time-frequency feature map in all layers of the fully-convolutional network. The use of 1D convolution makes it possible to apply recurrent layers to the intermediate outputs of the convolution layers. In addition, we propose an enhancement network and a residual regression method to further improve the separation result. The recurrent skip connections, the enhancement module, and the residual regression all improve the separation quality. The ARC model with residual regression achieves a signal-to-distortion ratio (SDR) of 5.74 for vocals on MUSDB (used in SiSEC 2018). We also evaluate the ARC model alone on the older DSD100 dataset (used in SiSEC 2016), where it achieves an SDR of 5.91 for vocals.
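The core architectural idea described above, applying a recurrent layer to the intermediate output of a 1D convolution so that the skip path carries temporal context, can be illustrated with a minimal PyTorch sketch. The layer sizes, the choice of a GRU, and the concatenation scheme below are illustrative assumptions, not the authors' exact configuration.

import torch
import torch.nn as nn

class RecurrentSkipBlock(nn.Module):
    """One encoder/decoder pair whose skip path runs through a GRU."""

    def __init__(self, channels: int, hidden: int):
        super().__init__()
        # 1D convolution along the time axis; input shape is (batch, channels, time)
        self.encode = nn.Conv1d(channels, hidden, kernel_size=3, padding=1)
        # Recurrent layer applied to the intermediate convolutional output
        self.skip_rnn = nn.GRU(hidden, hidden, batch_first=True)
        # Decoder sees the bottleneck features concatenated with the recurrent skip
        self.decode = nn.ConvTranspose1d(2 * hidden, channels, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = torch.relu(self.encode(x))                # (B, hidden, T)
        r, _ = self.skip_rnn(h.transpose(1, 2))       # GRU expects (B, T, hidden)
        r = r.transpose(1, 2)                         # back to (B, hidden, T)
        return self.decode(torch.cat([h, r], dim=1))  # (B, channels, T)

if __name__ == "__main__":
    # Treat frequency bins as channels: (batch, freq_bins, time_frames)
    spectrogram = torch.randn(2, 513, 128)
    block = RecurrentSkipBlock(channels=513, hidden=64)
    mask = torch.sigmoid(block(spectrogram))          # e.g. a soft separation mask
    print(mask.shape)                                 # torch.Size([2, 513, 128])

Because the GRU runs over the same time axis that the 1D convolutions preserve, the skip connection passes temporally smoothed features to the decoder rather than raw convolutional activations; this is the property the abstract attributes to using 1D rather than 2D convolutions.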
Pages: 773-778
Number of pages: 6