Musical Instrument Sound Multi-Excitation Model for Non-Negative Spectrogram Factorization

Cited by: 33

Authors:

Carabias-Orti, J. J. [1]

Virtanen, T. [2]

Vera-Candeas, P. [1]

Ruiz-Reyes, N. [1]

Canadas-Quesada, F. J. [1]

Affiliations:

[1] Univ Jaen, Telecommun Engn Dept, Jaen 23700, Spain

[2] Tampere Univ Technol, Dept Signal Proc, FI-33101 Tampere, Finland

Source:

IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING | 2011 / Vol. 5 / No. 6

Keywords:

Automatic music transcription; excitation-filter model; excitation modeling; non-negative matrix factorization (NMF); source-filter model; spectral analysis; MATRIX FACTORIZATION; AUTOMATIC TRANSCRIPTION; SOURCE SEPARATION; SMOOTHNESS; MELODY;

DOI:

10.1109/JSTSP.2011.2159700

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

This paper presents theoretical and experimental results on constrained non-negative matrix factorization (NMF) for modeling the excitation of musical instruments. In an excitation-filter model, the excitations represent the vibrating objects, while the filter represents the resonance structure of the instrument, which colors the produced sound. We propose to model the excitations as a weighted sum of harmonically constrained basis functions whose parameters are tied across the different pitches of an instrument. An NMF-based framework is used to learn the model parameters. We assume that the excitations of a well-tempered instrument possess an identifiable characteristic structure, whereas the conditions of the music scene may produce variations in the filter. To test the reliability of our proposal, we evaluate the method on a music transcription task in two scenarios. In the first, a comparison with state-of-the-art methods on a dataset of piano recordings yields more accurate results than other NMF-based algorithms. In the second, two woodwind instrument databases are used to demonstrate the benefits of our model over previous excitation-filter model approaches.
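As a rough illustration of the idea described in the abstract, the sketch below builds one spectral template per pitch as a weighted sum of harmonic partials, with the partial weights shared (tied) across pitches, and then estimates per-frame gains with standard multiplicative NMF updates. It is not the authors' algorithm: the Gaussian partial shape, the pitch set, the weight values, the choice of the Kullback-Leibler divergence, and the function names `harmonic_templates` and `estimate_gains` are all assumptions made for illustration, and the templates are held fixed here rather than learned.

```python
import numpy as np

def harmonic_templates(f0s, weights, freqs, width=10.0):
    """Build one template per pitch: a weighted sum of Gaussian partials at
    integer multiples of f0. `weights` is shared by all pitches (assumption)."""
    templates = np.zeros((freqs.size, len(f0s)))
    for p, f0 in enumerate(f0s):
        for h, w in enumerate(weights, start=1):
            templates[:, p] += w * np.exp(-0.5 * ((freqs - h * f0) / width) ** 2)
    return templates

def estimate_gains(X, W, n_iter=200, eps=1e-12):
    """Multiplicative KL-divergence NMF updates for the gains H, with W fixed."""
    rng = np.random.default_rng(0)
    H = rng.random((W.shape[1], X.shape[1])) + eps
    col_sums = W.sum(axis=0)[:, None]          # normalizer of the KL update
    for _ in range(n_iter):
        V = W @ H + eps                        # current model spectrogram
        H *= (W.T @ (X / V)) / (col_sums + eps)
    return H

# Hypothetical toy data: three pitches, four partials, three time frames.
freqs = np.linspace(0.0, 4000.0, 512)          # frequency axis in Hz
f0s = [220.0, 277.18, 329.63]                  # illustrative pitches
weights = [1.0, 0.6, 0.3, 0.15]                # tied partial weights (assumed)
W = harmonic_templates(f0s, weights, freqs)

H_true = np.array([[1.0, 0.0, 0.5],            # rows: pitches, columns: frames
                   [0.0, 1.0, 0.0],
                   [0.0, 0.0, 0.8]])
X = W @ H_true                                 # synthetic magnitude spectrogram
H_est = estimate_gains(X, W)
```

In a transcription setting, the rows of `H_est` would be thresholded to decide which pitches are active in each frame; the threshold and any post-processing are outside the scope of this sketch.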


Pages: 1144-1158

Page count: 15

Number of references: 46

