Measuring Structural Similarity in Music

Citations: 32
Authors
Bello, Juan P. [1]
Affiliations
[1] NYU, MARL, New York, NY 10012 USA
Source
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2011 / Vol. 19 / No. 7
Funding
National Science Foundation (USA);
Keywords
Audio signal processing; computer audition; music information retrieval (MIR); music structure analysis; sound similarity; RECURRENCE PLOTS; CONTACT MAPS; AUDIO; OVERLAP; SEARCH;
DOI
10.1109/TASL.2011.2108287
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification codes
070206; 082403;
Abstract
This paper presents a novel method for measuring the structural similarity between music recordings. It uses recurrence plot analysis to characterize patterns of repetition in the feature sequence, and the normalized compression distance, a practical approximation of the joint Kolmogorov complexity, to measure the pairwise similarity between the plots. By measuring the distance between intermediate representations of signal structure, the proposed method departs from common approaches to music structure analysis, which assume a block-based model of music and thus concentrate on segmenting and clustering sections. The approach ensures that global structure is consistently and robustly characterized in the presence of tempo, instrumentation, and key changes, while the metric used provides a simple-to-compute, versatile, and robust alternative to common approaches in music similarity research. Finally, experimental results demonstrate success at characterizing similarity, while contributing an optimal parameterization of the proposed approach.
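The two building blocks named in the abstract, a recurrence plot and the normalized compression distance (NCD), can be illustrated with a minimal sketch. This is not the paper's implementation: the toy two-dimensional feature frames, the fixed `radius` threshold, the choice of zlib as the compressor, and the function names `recurrence_plot`, `compressed_size`, and `ncd` are all assumptions made for illustration. The NCD formula itself is the standard one, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed length.

```python
import zlib


def recurrence_plot(features, radius):
    """Binary recurrence plot, flattened row by row.

    Cell (i, j) is 1 when frames i and j lie within `radius`
    (Euclidean distance) of each other, else 0.
    """
    n = len(features)
    plot = bytearray(n * n)
    for i in range(n):
        for j in range(n):
            dist = sum((a - b) ** 2 for a, b in zip(features[i], features[j])) ** 0.5
            plot[i * n + j] = 1 if dist <= radius else 0
    return bytes(plot)


def compressed_size(data):
    """Length in bytes of the zlib-compressed data (assumed compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x, y):
    """Normalized compression distance between two byte strings."""
    cx, cy, cxy = compressed_size(x), compressed_size(y), compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Toy "recordings": an AABB pattern repeated four times, and the same
# pattern shifted by a constant offset (a crude stand-in for a key change).
A, B = (0, 0), (5, 5)
song_a = [A, A, B, B] * 4
song_b = [(x + 3, y + 3) for x, y in song_a]

plot_a = recurrence_plot(song_a, radius=1.0)
plot_b = recurrence_plot(song_b, radius=1.0)
distance = ncd(plot_a, plot_b)
```

In this toy case the constant offset leaves all pairwise frame distances unchanged, so `plot_a` and `plot_b` are identical and `distance` equals the distance of a plot to itself. That mirrors, in a very simplified way, the idea of comparing repetition structure rather than the raw features.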
Pages: 2013-2025
Page count: 13