Time-Frequency Analysis as Probabilistic Inference

Cited by: 24

Authors:

Turner, Richard E. ^{[1]}

Sahani, Maneesh ^{[2]}

Affiliations:

[1] Univ Cambridge, Dept Engn, Cambridge CB2 1TN, England

[2] UCL, Gatsby Computat Neurosci Unit, London WC1N 3AR, England

Source:

IEEE TRANSACTIONS ON SIGNAL PROCESSING | 2014 / Vol. 62 / No. 23

Funding:

Engineering and Physical Sciences Research Council (EPSRC), UK;

Keywords:

Audio signal processing; inference; machine-learning; time-frequency analysis; NONNEGATIVE MATRIX FACTORIZATION; REPRESENTATION; NOISE; EM;

DOI:

10.1109/TSP.2014.2362100

Chinese Library Classification (CLC):

TM [Electrical Engineering]; TN [Electronics and Communication Technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

This paper proposes a new view of time-frequency analysis framed in terms of probabilistic inference. Natural signals are assumed to be formed by the superposition of distinct time-frequency components, with the analytic goal being to infer these components by application of Bayes' rule. The framework serves to unify various existing models for natural time-series; it relates to both the Wiener and Kalman filters, and with suitable assumptions yields inferential interpretations of the short-time Fourier transform, spectrogram, filter bank, and wavelet representations. Value is gained by placing time-frequency analysis on the same probabilistic basis as is often employed in applications such as denoising, source separation, or recognition. Uncertainty in the time-frequency representation can be propagated correctly to application-specific stages, improving the handling of noise and missing data. Probabilistic learning allows modules to be co-adapted; thus, the time-frequency representation can be adapted to both the demands of the application and the time-varying statistics of the signal at hand. Similarly, the application module can be adapted to fine properties of the signal propagated by the initial time-frequency processing. We demonstrate these benefits by combining probabilistic time-frequency representations with non-negative matrix factorization, finding benefits in audio denoising and inpainting tasks, albeit with higher computational cost than incurred by the standard approach.


Pages: 6171-6183

Number of pages: 13
