Multiple-instrument polyphonic music transcription using a temporally constrained shift-invariant model

Cited by: 39
Authors
Benetos, Emmanouil [1]
Dixon, Simon [1]
Affiliation
[1] Queen Mary Univ London, Sch Elect Engn & Comp Sci, Ctr Digital Mus, London E1 4NS, England
Keywords
FUNDAMENTAL-FREQUENCY ESTIMATION; TUTORIAL;
DOI
10.1121/1.4790351
Chinese Library Classification (CLC) number
O42 [Acoustics];
Subject classification code
070206 ; 082403 ;
Abstract
A method for automatic transcription of polyphonic music is proposed in this work that models the temporal evolution of musical tones. The model extends the shift-invariant probabilistic latent component analysis method by supporting the use of spectral templates that correspond to sound states such as attack, sustain, and decay. The order of these templates is controlled using hidden Markov model-based temporal constraints. In addition, the model can exploit multiple templates per pitch and instrument source. The shift-invariant aspect of the model makes it suitable for music signals that exhibit frequency modulations or tuning changes. Pitch-wise hidden Markov models are also utilized in a postprocessing step for note tracking. For training, sound state templates were extracted for various orchestral instruments using isolated note samples. The proposed transcription system was tested on multiple-instrument recordings from various datasets. Experimental results show that the proposed model is superior to a non-temporally constrained model and also outperforms various state-of-the-art transcription systems for the same experiment. © 2013 Acoustical Society of America. [http://dx.doi.org/10.1121/1.4790351]
Pages: 1727-1741
Number of pages: 15