REpeating Pattern Extraction Technique (REPET): A Simple Method for Music/Voice Separation

Cited by: 146
Authors
Rafii, Zafar [1]
Pardo, Bryan [1]
Affiliations
[1] Northwestern Univ, Dept Elect & Comp Engn, Ford Motor Co Engn Design Ctr, Evanston, IL 60208 USA
Source
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2013 / Vol. 21 / No. 1
Funding
US National Science Foundation;
Keywords
Melody extraction; music structure analysis; music/voice separation; repeating patterns; singing voice
DOI
10.1109/TASL.2012.2213249
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Repetition is a core principle in music. Many musical pieces are characterized by an underlying repeating structure over which varying elements are superimposed. This is especially true for pop songs where a singer often overlays varying vocals on a repeating accompaniment. On this basis, we present the REpeating Pattern Extraction Technique (REPET), a novel and simple approach for separating the repeating "background" from the non-repeating "foreground" in a mixture. The basic idea is to identify the periodically repeating segments in the audio, compare them to a repeating segment model derived from them, and extract the repeating patterns via time-frequency masking. Experiments on data sets of 1,000 song clips and 14 full-track real-world songs showed that this method can be successfully applied for music/voice separation, competing with two recent state-of-the-art approaches. Further experiments showed that REPET can also be used as a preprocessor to pitch detection algorithms to improve melody extraction.
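The masking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the repeating segment model is built or how the mask is computed, so the element-wise median across repetitions, the `min` comparison, and the ratio mask below are assumptions. The function name `repeating_mask` and the toy input are hypothetical, and the repeating period is taken as known rather than estimated from the audio.

```python
from statistics import median


def repeating_mask(spec, period):
    """Return a soft time-frequency mask for the repeating part of `spec`.

    spec   -- list of frequency rows, each a list of non-negative
              magnitudes, one per time frame (a toy magnitude spectrogram)
    period -- assumed repeating period in frames (not estimated here)
    """
    if period < 1 or any(len(row) < period for row in spec):
        raise ValueError("period must be between 1 and the number of frames")
    mask = []
    for row in spec:
        # Assumed segment model: for each position inside the period, the
        # median over the frames at that position across all repetitions.
        model = [median(row[j::period]) for j in range(period)]
        mask_row = []
        for t, value in enumerate(row):
            # The repeating part is capped by the mixture value itself.
            repeating = min(model[t % period], value)
            mask_row.append(repeating / value if value > 0 else 0.0)
        mask.append(mask_row)
    return mask


# Toy mixture: a background pattern [1, 2] repeated four times, with a
# non-repeating "foreground" of 3.0 added at frame 2.
mixture = [[1.0, 2.0, 4.0, 2.0, 1.0, 2.0, 1.0, 2.0]]
mask = repeating_mask(mixture, period=2)
background = [[m * v for m, v in zip(mr, vr)] for mr, vr in zip(mask, mixture)]
```

On this toy input the mask is 0.25 at frame 2 and 1.0 elsewhere, so the estimated background returns to the repeating pattern. A real system would work on a short-time Fourier transform of the audio and would first need to estimate the period.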
Pages: 71-82
Page count: 12