Measuring Structural Similarity in Music

Cited by: 32
Authors
Bello, Juan P. [1]
Affiliations
[1] NYU, MARL, New York, NY 10012 USA
Source
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2011 / Vol. 19 / Issue 07
Funding
U.S. National Science Foundation;
Keywords
Audio signal processing; computer audition; music information retrieval (MIR); music structure analysis; sound similarity; RECURRENCE PLOTS; CONTACT MAPS; AUDIO; OVERLAP; SEARCH;
DOI
10.1109/TASL.2011.2108287
Chinese Library Classification number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
This paper presents a novel method for measuring the structural similarity between music recordings. It uses recurrence plot analysis to characterize patterns of repetition in the feature sequence, and the normalized compression distance, a practical approximation of the joint Kolmogorov complexity, to measure the pairwise similarity between the plots. By measuring the distance between intermediate representations of signal structure, the proposed method departs from common approaches to music structure analysis which assume a block-based model of music, and thus concentrate on segmenting and clustering sections. The approach ensures that global structure is consistently and robustly characterized in the presence of tempo, instrumentation, and key changes, while the used metric provides a simple to compute, versatile and robust alternative to common approaches in music similarity research. Finally, experimental results demonstrate success at characterizing similarity, while contributing an optimal parameterization of the proposed approach.
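As a rough illustration of the two steps the abstract names, the sketch below builds a binary recurrence plot from a feature sequence and compares two plots with the normalized compression distance, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed size. The function names, the Euclidean threshold `radius`, the row-wise byte serialization, and the use of `bz2` as the compressor are all assumptions made for illustration; the abstract does not specify the features, the recurrence criterion, or the compressor.

```python
import bz2
import math


def recurrence_plot(features, radius):
    """Binary recurrence plot: cell (i, j) is 1 when frames i and j lie
    within `radius` of each other (Euclidean distance, an assumption)."""
    n = len(features)
    return [
        [1 if math.dist(features[i], features[j]) <= radius else 0
         for j in range(n)]
        for i in range(n)
    ]


def plot_to_bytes(plot):
    """Serialize a binary plot row by row, one byte per cell (hypothetical encoding)."""
    return bytes(cell for row in plot for cell in row)


def compressed_size(data):
    """Length in bytes of the bz2-compressed data; stands in for C(.)."""
    return len(bz2.compress(data))


def ncd(x, y):
    """Normalized compression distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Toy one-dimensional "feature" sequences with different repetition patterns.
seq_abab = [(0.0,), (1.0,), (0.0,), (1.0,)]
seq_aabb = [(0.0,), (0.0,), (1.0,), (1.0,)]

plot_abab = recurrence_plot(seq_abab, radius=0.1)
plot_aabb = recurrence_plot(seq_aabb, radius=0.1)
distance = ncd(plot_to_bytes(plot_abab), plot_to_bytes(plot_aabb))
```

Because each plot records only which frames resemble one another within the same recording, it describes the pattern of repetition rather than the feature values themselves. On inputs as small as these toy sequences the compressor's fixed overhead dominates, so the resulting distance is not meaningful; the sketch only shows how the pieces fit together.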
Pages: 2013-2025
Page count: 13