Multipitch Estimation of Piano Sounds Using a New Probabilistic Spectral Smoothness Principle

Cited by: 186
Authors
Emiya, Valentin [1, 2]
Badeau, Roland [2 ]
David, Bertrand [2 ]
Affiliations
[1] INRIA, Ctr Inria Rennes Bretagne Atlantique, F-35042 Rennes, France
[2] CNRS, Inst Telecom, Telecom ParisTech, F-75014 Paris, France
Source
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2010 / Vol. 18 / No. 06
Keywords
Acoustic signal analysis; audio processing; multipitch estimation (MPE); piano; transcription; spectral smoothness; FUNDAMENTAL-FREQUENCY ESTIMATION; MUSIC; TRANSCRIPTION; MODEL
DOI
10.1109/TASL.2009.2038819
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
A new method for the estimation of multiple concurrent pitches in piano recordings is presented. It addresses the issue of overlapping overtones by modeling the spectral envelope of the overtones of each note with a smooth autoregressive model. The background noise is modeled with a moving-average model, and the combination of the two tends to eliminate erroneous pitch estimates at harmonic and sub-harmonic positions. This leads to a complete generative spectral model for simultaneous piano notes, which also explicitly includes the typical deviation from exact harmonicity in a piano overtone series. The pitch set that maximizes an approximate likelihood is selected from among a restricted number of possible pitch combinations. Tests have been conducted on a large homemade database called MAPS, composed of piano recordings from a real upright piano and from high-quality samples.
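The deviation from exact harmonicity mentioned in the abstract can be illustrated with a small sketch. The abstract does not state the formula, so the code below assumes the standard stiff-string model, in which the h-th partial lies at h * f0 * sqrt(1 + B * h^2). The function name `partial_frequencies`, the inharmonicity coefficient `B`, and the numeric values are illustrative assumptions, not taken from this record.

```python
import math


def partial_frequencies(f0, inharmonicity, n_partials):
    """Return the first n_partials partial frequencies (Hz) of a stiff string.

    Assumes the textbook stiff-string model f_h = h * f0 * sqrt(1 + B * h**2);
    this formula is an assumption, since the abstract does not spell it out.
    """
    return [
        h * f0 * math.sqrt(1.0 + inharmonicity * h * h)
        for h in range(1, n_partials + 1)
    ]


if __name__ == "__main__":
    f0 = 220.0   # illustrative fundamental frequency in Hz
    B = 4e-4     # illustrative inharmonicity coefficient (hypothetical value)
    for h, f in enumerate(partial_frequencies(f0, B, 8), start=1):
        # Deviation of the partial from the exact harmonic h * f0, in cents.
        cents = 1200.0 * math.log2(f / (h * f0))
        print(f"partial {h}: {f:8.2f} Hz  (+{cents:.1f} cents)")
```

With `B` set to zero the partials reduce to exact multiples of `f0`. For a positive `B`, each partial is sharper than its harmonic position and the offset grows with the partial index, which is why a strictly harmonic comb would fit the upper piano partials poorly.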
Pages: 1643-1654
Number of pages: 12