REpeating Pattern Extraction Technique (REPET): A Simple Method for Music/Voice Separation

Cited by: 146

Authors:

Rafii, Zafar [1]

Pardo, Bryan [1]

Affiliations:

[1] Northwestern Univ, Dept Elect & Comp Engn, Ford Motor Co Engn Design Ctr, Evanston, IL 60208 USA

Source:

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2013, Vol. 21, No. 1

Funding:

National Science Foundation (USA)

Keywords:

Melody extraction; music structure analysis; music/voice separation; repeating patterns; singing voice

DOI:

10.1109/TASL.2012.2213249

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

Repetition is a core principle in music. Many musical pieces are characterized by an underlying repeating structure over which varying elements are superimposed. This is especially true for pop songs where a singer often overlays varying vocals on a repeating accompaniment. On this basis, we present the REpeating Pattern Extraction Technique (REPET), a novel and simple approach for separating the repeating "background" from the non-repeating "foreground" in a mixture. The basic idea is to identify the periodically repeating segments in the audio, compare them to a repeating segment model derived from them, and extract the repeating patterns via time-frequency masking. Experiments on data sets of 1,000 song clips and 14 full-track real-world songs showed that this method can be successfully applied for music/voice separation, competing with two recent state-of-the-art approaches. Further experiments showed that REPET can also be used as a preprocessor to pitch detection algorithms to improve melody extraction.
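The three steps named in the abstract (find the repeating period, build a repeating segment model, extract the repeating part with a time-frequency mask) can be illustrated with a minimal sketch. The abstract does not spell out the details, so everything below is an illustrative assumption rather than the authors' implementation: the autocorrelation-based period estimate, the median segment model, the ratio mask, and the function names `beat_spectrum`, `estimate_period`, and `repeating_mask` are all hypothetical. The sketch works directly on a non-negative magnitude spectrogram given as nested lists and omits the STFT and signal resynthesis.

```python
from statistics import median


def beat_spectrum(V):
    """Average over frequency bins of the unbiased autocorrelation of V**2 along time."""
    n_freq, n_time = len(V), len(V[0])
    power = [[v * v for v in row] for row in V]
    b = []
    for lag in range(n_time):
        total = sum(
            row[t] * row[t + lag] for row in power for t in range(n_time - lag)
        )
        b.append(total / (n_freq * (n_time - lag)))
    return b


def estimate_period(b):
    """Pick the lag with the largest beat-spectrum value (assumed simplification:
    ties and integer multiples of the true period are not handled)."""
    max_lag = len(b) // 3  # require at least three repetitions in the excerpt
    if max_lag < 1:
        raise ValueError("need at least three frames")
    return max(range(1, max_lag + 1), key=lambda lag: b[lag])


def repeating_mask(V, period):
    """Soft mask in [0, 1]: equal to 1 where a bin matches the repeating model."""
    n_time = len(V[0])
    mask = []
    for row in V:
        # Segment model: median over all frames sharing the same position in the period.
        model = [median(row[pos::period]) for pos in range(period)]
        mask_row = []
        for t in range(n_time):
            # The repeating part cannot exceed the observed magnitude.
            w = min(model[t % period], row[t])
            mask_row.append(w / row[t] if row[t] > 0 else 1.0)
        mask.append(mask_row)
    return mask


# Toy magnitude spectrogram: 2 bins x 12 frames, background period of 4 frames,
# plus one non-repeating "foreground" event at bin 0, frame 6.
V = [
    [4, 1, 1, 1, 4, 1, 3, 1, 4, 1, 1, 1],
    [1, 1, 3, 1, 1, 1, 3, 1, 1, 1, 3, 1],
]
period = estimate_period(beat_spectrum(V))
M = repeating_mask(V, period)
background = [[m * v for m, v in zip(mr, vr)] for mr, vr in zip(M, V)]
foreground = [[(1 - m) * v for m, v in zip(mr, vr)] for mr, vr in zip(M, V)]
```

On this toy input the estimated period is 4 frames, and the mask falls below 1 only at the foreground event, so `foreground` is non-zero only there. Real audio would need an STFT front end, a more robust period estimate, and inversion of the masked spectrogram back to a waveform.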

Citation

Pages: 71-82

Page count: 12

