MULTIMODAL SIMILARITY BETWEEN MUSICAL STREAMS FOR COVER VERSION DETECTION

Cited by: 20
Authors
Foucard, Remi [1]
Durrieu, Jean-Louis [1]
Lagrange, Mathieu [1]
Richard, Gael [1]
Affiliations
[1] TELECOM ParisTech, CNRS LTCI, Inst TELECOM, F-75014 Paris, France
Source
2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2010
Keywords
Cover Song Identification; Music Similarity; Main melody extraction; Signal Processing; Music Information Retrieval;
DOI
10.1109/ICASSP.2010.5495217
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
Expressing the similarity between musical streams is a challenging task, as it involves understanding many factors that are most often blended into a single information channel: the audio stream. Consequently, separating the musical audio stream into its main melody and its accompaniment may prove useful for grounding the similarity computation in a more robust and expressive representation. In this paper, we show that treating the mixture, an estimate of its main melody, and its accompaniment as modalities allows us to propose new ways of defining the similarity between musical streams. In the context of cover version detection, we show that the highest performance is achieved by jointly considering the mixture and the estimated accompaniment. As demonstrated by experiments carried out on two different evaluation databases, this scheme allows the scoring system to focus more on the chord progression by considering the accompaniment, while remaining robust to potential separation errors by also considering the mixture.
Pages: 5514-5517
Page count: 4