Music Structure Analysis Using a Probabilistic Fitness Measure and a Greedy Search Algorithm

Cited by: 36
Authors
Paulus, Jouni [1]
Klapuri, Anssi [1]
Affiliations
[1] Tampere Univ Technol, Dept Signal Proc, FI-33720 Tampere, Finland
Source
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2009, Vol. 17, No. 6
Funding
Academy of Finland;
Keywords
Acoustic signal analysis; algorithms; modeling; music; search methods; AUDIO;
DOI
10.1109/TASL.2009.2020533
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification codes
070206; 082403;
Abstract
This paper proposes a method for recovering the sectional form of a musical piece from an acoustic signal. The description of form consists of a segmentation of the piece into musical parts, grouping of the segments representing the same part, and assigning musically meaningful labels, such as "chorus" or "verse," to the groups. The method uses a fitness function for the descriptions to select the one with the highest match with the acoustic properties of the input piece. Different aspects of the input signal are described with three acoustic features: mel-frequency cepstral coefficients, chroma, and rhythmogram. The features are used to estimate the probability that two segments in the description are repeats of each other, and the probabilities are used to determine the total fitness of the description. Creating the candidate descriptions is a combinatorial problem and a novel greedy algorithm constructing descriptions gradually is proposed to solve it. The group labeling utilizes a musicological model consisting of N-grams. The proposed method is evaluated on three data sets of musical pieces with manually annotated ground truth. The evaluations show that the proposed method is able to recover the structural description more accurately than the state-of-the-art reference method.
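The abstract does not give the fitness function or the search procedure in detail, so the following is only a toy sketch of the general idea: score a candidate grouping from pairwise repeat probabilities, and build the grouping greedily one segment at a time. The names `pair_score`, `fitness`, `greedy_grouping`, and the `repeat_prob` matrix are hypothetical, the log-likelihood scoring is an assumption, and segment boundaries are taken as given. Feature extraction and the N-gram labeling step are not covered.

```python
import math

def pair_score(p, same_group):
    """Log-likelihood contribution of one segment pair (assumed scoring rule)."""
    eps = 1e-9  # guard against log(0)
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p) if same_group else math.log(1.0 - p)

def fitness(groups, repeat_prob):
    """Toy fitness of a description: sum of pair scores over all segment pairs."""
    total = 0.0
    for i in range(len(groups)):
        for j in range(i + 1, len(groups)):
            total += pair_score(repeat_prob[i][j], groups[i] == groups[j])
    return total

def greedy_grouping(repeat_prob):
    """Assign segments to groups one at a time, keeping the best partial result."""
    groups = []
    for _ in range(len(repeat_prob)):
        # Try every existing group, plus one new group.
        candidates = sorted(set(groups)) + [len(set(groups))]
        best = max(candidates, key=lambda g: fitness(groups + [g], repeat_prob))
        groups.append(best)
    return groups

# Hypothetical repeat probabilities for four segments (A B A B pattern).
repeat_prob = [
    [1.0, 0.1, 0.9, 0.2],
    [0.1, 1.0, 0.2, 0.8],
    [0.9, 0.2, 1.0, 0.1],
    [0.2, 0.8, 0.1, 1.0],
]
description = greedy_grouping(repeat_prob)
print(description)  # [0, 1, 0, 1]
```

A greedy pass like this evaluates only a handful of candidates per segment instead of every possible grouping, which is the usual motivation for a greedy search on a combinatorial problem; it can miss the globally best description.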
Pages: 1159-1170
Number of pages: 12