MaD TwinNet: Masker-Denoiser Architecture with Twin Networks for Monaural Sound Source Separation
Cited by: 0
Authors:
Drossos, Konstantinos [1]
Mimilakis, Stylianos Ioannis [2]
Serdyuk, Dmitriy [3]
Schuller, Gerald [2]
Virtanen, Tuomas [1]
Bengio, Yoshua [3]
Affiliations:
[1] Tampere Univ Technol, Tampere, Finland
[2] Tech Univ Ilmenau, Fraunhofer IDMT, Ilmenau, Germany
[3] Univ Montreal, MILA, Montreal, PQ, Canada
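Funding:
European Union Horizon 2020;
European Research Council;
Natural Sciences and Engineering Research Council of Canada;
Keywords:
DOI:
Not available
Chinese Library Classification number:
TP18 [Artificial intelligence theory];
Discipline classification codes: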
081104 ;
0812 ;
0835 ;
1405 ;
Abstract:
The monaural singing voice separation task focuses on predicting the singing voice from a single-channel music mixture signal. Current state-of-the-art (SOTA) results in monaural singing voice separation are obtained with deep-learning-based methods. In this work we present a novel recurrent neural approach that learns long-term temporal patterns and structures of a musical piece. We build upon the recently proposed Masker-Denoiser (MaD) architecture and enhance it with Twin Networks, a technique that regularizes a recurrent generative network using a backward-running copy of the network. We evaluate our method on the Demixing Secret Dataset and obtain an increase in signal-to-distortion ratio (SDR) of 0.37 dB and in signal-to-interference ratio (SIR) of 0.23 dB compared to previous SOTA results.
Pages: 8