A Multiple-F0 Estimation Approach Based on Gaussian Spectral Modelling for Polyphonic Music Transcription

Cited by: 8
Authors
Canadas Quesada, F. J.
Ruiz Reyes, N. [1]
Vera Candeas, P.
Carabias, J. J.
Maldonado, S. [2]
Affiliations
[1] Univ Jaen, Telecommun Engn Dept, Polytech Sch, Jaen, Spain
[2] Univ Alcala de Henares, Alcala De Henares 99775, Spain
Keywords
SIGNALS; SEPARATION;
DOI
10.1080/09298211003695579
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
This paper proposes a multiple-F0 estimation algorithm for automatic polyphonic music transcription. The algorithm operates at frame level, searching for the set of fundamental frequencies that minimizes a spectral distance measure at each audio frame. The spectral distance measure is defined under the assumption that a polyphonic sound can be modelled by a weighted sum of Gaussian spectral models. Because the spectral content of the current audio frame in polyphonic music depends to a large extent on the immediately preceding frames, the distance measure takes into account information from the current frame and from some previous ones. A Hidden Markov Model (HMM) applied at the end of the algorithm yields a further performance improvement. The algorithm is tested on real-world polyphonic music recordings taken from the RWC music database. Accuracy rates are reported for the algorithm under different conditions, together with a breakdown of the total error into three categories (substitutions, misses and false alarms). Finally, the algorithm is compared with five recent state-of-the-art transcription systems.
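The frame-level idea in the abstract (a spectrum modelled as a weighted sum of Gaussian spectral models, and a search for the F0 set with the smallest spectral distance) can be illustrated with a minimal sketch. This is not the paper's algorithm: the 1/h harmonic decay, the Gaussian width `sigma`, the squared-error distance, the greedy search, and all function and parameter names below are assumptions made for illustration. The sketch also uses a single frame only, with no information from previous frames and no HMM stage.

```python
import math


def gaussian_template(f0, freqs, n_harmonics=5, sigma=10.0):
    """Spectral model of one note: Gaussians centred on the harmonics of f0.

    The 1/h amplitude decay and sigma (in Hz) are illustrative assumptions.
    """
    return [
        sum(math.exp(-0.5 * ((f - h * f0) / sigma) ** 2) / h
            for h in range(1, n_harmonics + 1))
        for f in freqs
    ]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def estimate_f0s(spectrum, freqs, candidates, max_notes=4, min_gain=0.05):
    """Greedily pick F0s whose weighted templates best reduce the squared error.

    Returns a dict {f0: weight}. Greedy selection is a stand-in for the
    search over F0 sets; it is an assumption, not the published method.
    """
    templates = {f0: gaussian_template(f0, freqs) for f0 in candidates}
    energy = dot(spectrum, spectrum)
    residual = list(spectrum)
    chosen = {}
    for _ in range(max_notes):
        best = None
        for f0, t in templates.items():
            if f0 in chosen:
                continue
            tt = dot(t, t)
            # Closed-form non-negative weight for a single template.
            w = max(0.0, dot(residual, t) / tt)
            gain = w * w * tt  # drop in squared error when adding w * t
            if best is None or gain > best[0]:
                best = (gain, f0, w)
        # Stop when no remaining candidate explains enough of the frame.
        if best is None or best[0] <= min_gain * energy:
            break
        _, f0, w = best
        chosen[f0] = w
        residual = [r - w * x for r, x in zip(residual, templates[f0])]
    return chosen


# Synthetic frame: two notes (220 Hz and 330 Hz) on a 5 Hz frequency grid.
freqs = [5.0 * k for k in range(501)]
candidates = [110.0, 165.0, 220.0, 277.0, 330.0, 440.0]
frame = [a + 0.6 * b for a, b in zip(gaussian_template(220.0, freqs),
                                     gaussian_template(330.0, freqs))]
result = estimate_f0s(frame, freqs, candidates)
print(sorted(result.items()))
```

The `min_gain` threshold controls how many notes are reported: a candidate is accepted only if it removes at least that fraction of the frame's energy, which is one simple way to keep octave-related candidates (here 110 Hz and 440 Hz) from being added once the true notes are explained.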
Pages: 93-107
Number of pages: 15