Dynamic Spectral Envelope Modeling for Timbre Analysis of Musical Instrument Sounds

Cited by: 21
Authors
Burred, Juan Jose [1]
Roebel, Axel [1]
Sikora, Thomas [2]
Affiliations
[1] IRCAM CNRS STMS, Anal Synth Team, F-75004 Paris, France
[2] Tech Univ Berlin, Commun Syst Grp, D-10587 Berlin, Germany
Source
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2010 / Vol. 18 / No. 3
Keywords
Gaussian processes; music information retrieval (MIR); sinusoidal modeling; spectral envelope; timbre model; CLASSIFICATION;
DOI
10.1109/TASL.2009.2036300
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
We present a computational model of musical instrument sounds that focuses on capturing the dynamic behavior of the spectral envelope. A set of spectro-temporal envelopes belonging to different notes of each instrument are extracted by means of sinusoidal modeling and subsequent frequency interpolation, before being subjected to principal component analysis. The prototypical evolution of the envelopes in the obtained reduced-dimensional space is modeled as a nonstationary Gaussian Process. This results in a compact representation in the form of a set of prototype curves in feature space, or equivalently of prototype spectro-temporal envelopes in the time-frequency domain. Finally, the obtained models are successfully evaluated in the context of two music content analysis tasks: classification of instrument samples and detection of instruments in monaural polyphonic mixtures.
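The pipeline outlined in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The sinusoidal modeling stage is replaced by synthetic partial tracks. The nonstationary Gaussian process is reduced to a per-frame mean and variance, which assumes independent dimensions and trajectories of equal length. All function names and parameter values are hypothetical.

```python
import numpy as np

def interpolate_envelope(partial_freqs, partial_amps, freq_grid):
    """Linearly interpolate partial amplitudes (dB) onto a fixed frequency grid, frame by frame."""
    n_frames = partial_freqs.shape[0]
    env = np.empty((n_frames, freq_grid.size))
    for t in range(n_frames):
        order = np.argsort(partial_freqs[t])  # np.interp needs increasing x values
        env[t] = np.interp(freq_grid, partial_freqs[t][order], partial_amps[t][order])
    return env

def fit_pca(X, n_components):
    """Return the data mean and the first principal axes (rows), computed via SVD."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def prototype_curve(trajectories):
    """Per-frame mean and variance across equal-length trajectories (assumed simplification)."""
    stacked = np.stack(trajectories)  # (n_notes, n_frames, n_components)
    return stacked.mean(axis=0), stacked.var(axis=0)

# Synthetic stand-in for sinusoidal-model output: three notes, ten harmonic partials each.
rng = np.random.default_rng(0)
n_frames, n_partials, n_components = 20, 10, 3
freq_grid = np.linspace(200.0, 4000.0, 40)
k = np.arange(1, n_partials + 1)

envelopes = []
for f0 in (220.0, 330.0, 440.0):
    freqs = np.tile(f0 * k, (n_frames, 1))              # partial frequencies per frame
    decay = np.linspace(0.0, -20.0, n_frames)[:, None]  # overall amplitude decay in dB
    amps = -3.0 * k[None, :] + decay + rng.normal(0.0, 0.5, (n_frames, n_partials))
    envelopes.append(interpolate_envelope(freqs, amps, freq_grid))

# PCA over all frames of all notes, then project each note to a trajectory in feature space.
X = np.concatenate(envelopes)
mean, basis = fit_pca(X, n_components)
trajectories = [(env - mean) @ basis.T for env in envelopes]

# Prototype curve in feature space and its equivalent time-frequency envelope.
proto_mean, proto_var = prototype_curve(trajectories)
proto_envelope = proto_mean @ basis + mean
```

Interpolating every note onto the same frequency grid is what lets envelopes from different pitches share one PCA basis. Mapping the prototype curve back through that basis gives the equivalent prototype spectro-temporal envelope mentioned in the abstract.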
Pages: 663-674
Number of pages: 12