A Shift-Invariant Latent Variable Model for Automatic Music Transcription

Cited by: 44
Authors
Benetos, Emmanouil [1 ]
Dixon, Simon [1 ]
Affiliations
[1] Queen Mary Univ London, Sch Elect Engn & Comp Sci, Ctr Digital Mus, London E1 4NS, England
DOI
10.1162/COMJ_a_00146
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
In this work, a probabilistic model for multiple-instrument automatic music transcription is proposed. The model extends the shift-invariant probabilistic latent component analysis method, which is used for spectrogram factorization. Proposed extensions support the use of multiple spectral templates per pitch and per instrument source, as well as a time-varying pitch contribution for each source. Thus, this method can effectively be used for multiple-instrument automatic transcription. In addition, the shift-invariant aspect of the method can be exploited for detecting tuning changes and frequency modulations, as well as for visualizing pitch content. For note tracking and smoothing, pitch-wise hidden Markov models are used. For training, pitch templates from eight orchestral instruments were extracted, covering their complete note range. The transcription system was tested on multiple-instrument polyphonic recordings from the RWC database, a Disklavier data set, and the MIREX 2007 multi-F0 data set. Results demonstrate that the proposed method outperforms leading approaches from the transcription literature, using several error metrics.
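The note-tracking step mentioned in the abstract can be illustrated with a minimal sketch. The abstract states only that pitch-wise hidden Markov models are used; everything else here is an assumption made for illustration and is not taken from the paper: the two-state (off/on) structure, the symmetric self-transition probability `p_stay`, the treatment of each activation value as the probability of the "on" state, and the function name `track_notes`.

```python
import math


def track_notes(activation, p_stay=0.9, eps=1e-9):
    """Two-state (0 = off, 1 = on) Viterbi smoothing for a single pitch.

    activation: per-frame values in [0, 1], treated here (an assumption)
                as the probability that the pitch is active in that frame.
    p_stay:     assumed probability of remaining in the same state.
    Returns the most likely 0/1 state sequence, one entry per frame.
    """
    if not activation:
        return []

    # log_trans[i][j] = log P(next state = j | current state = i)
    log_trans = [
        [math.log(p_stay), math.log(1.0 - p_stay)],
        [math.log(1.0 - p_stay), math.log(p_stay)],
    ]

    def log_obs(a):
        # Clamp to avoid log(0), then return [log P(off), log P(on)].
        a = min(max(a, eps), 1.0 - eps)
        return [math.log(1.0 - a), math.log(a)]

    # Initialisation with a uniform prior over the two states.
    delta = [math.log(0.5) + lo for lo in log_obs(activation[0])]
    backptr = []

    # Forward pass: keep the best predecessor for each state and frame.
    for a in activation[1:]:
        lo = log_obs(a)
        new_delta, ptr = [], []
        for j in (0, 1):
            scores = [delta[i] + log_trans[i][j] for i in (0, 1)]
            best = 0 if scores[0] >= scores[1] else 1
            ptr.append(best)
            new_delta.append(scores[best] + lo[j])
        delta = new_delta
        backptr.append(ptr)

    # Backtracking from the best final state.
    state = 0 if delta[0] >= delta[1] else 1
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    # A short dip (0.4) inside a note is bridged rather than split.
    print(track_notes([0.1, 0.1, 0.9, 0.4, 0.9, 0.9, 0.1, 0.1]))
```

In a full system, one such chain would be run independently for each pitch over that pitch's activation across time, which is what "pitch-wise" suggests. A higher `p_stay` produces smoother output at the cost of missing very short notes.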
Pages: 81-94
Page count: 14