Time-Frequency Analysis as Probabilistic Inference

Cited by: 24
Authors
Turner, Richard E. [1]
Sahani, Maneesh [2]
Affiliations
[1] Univ Cambridge, Dept Engn, Cambridge CB2 1TN, England
[2] UCL, Gatsby Computat Neurosci Unit, London WC1N 3AR, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
Keywords
Audio signal processing; inference; machine learning; time-frequency analysis; non-negative matrix factorization; representation; noise; EM
DOI
10.1109/TSP.2014.2362100
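Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract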
This paper proposes a new view of time-frequency analysis framed in terms of probabilistic inference. Natural signals are assumed to be formed by the superposition of distinct time-frequency components, with the analytic goal being to infer these components by application of Bayes' rule. The framework serves to unify various existing models for natural time-series; it relates to both the Wiener and Kalman filters, and with suitable assumptions yields inferential interpretations of the short-time Fourier transform, spectrogram, filter bank, and wavelet representations. Value is gained by placing time-frequency analysis on the same probabilistic basis as is often employed in applications such as denoising, source separation, or recognition. Uncertainty in the time-frequency representation can be propagated correctly to application-specific stages, improving the handling of noise and missing data. Probabilistic learning allows modules to be co-adapted; thus, the time-frequency representation can be adapted to both the demands of the application and the time-varying statistics of the signal at hand. Similarly, the application module can be adapted to fine properties of the signal propagated by the initial time-frequency processing. We demonstrate these benefits by combining probabilistic time-frequency representations with non-negative matrix factorization, finding benefits in audio denoising and inpainting tasks, albeit with higher computational cost than incurred by the standard approach.
Pages: 6171-6183
Page count: 13
Related papers
54 entries in total
[1]  
Achan K., 2003, NEURAL INFORM PROCES, V16, P1393
[2]  
[Anonymous], 2006, TOEPLITZ CIRCULANT M
[3]  
[Anonymous], IEEE WORKSH APPL SIG
[4]  
[Anonymous], 1964, Extrapolation, interpolation, and smoothing of stationary time series: with engineering applications
[5]  
Badeau R., 2013, P IEEE INT C AC SPEE
[6]  
Badeau R, 2011, 2011 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA), P253, DOI 10.1109/ASPAA.2011.6082264
[7]   A SIGNAL-DEPENDENT TIME-FREQUENCY REPRESENTATION - OPTIMAL KERNEL DESIGN [J].
BARANIUK, RG ;
JONES, DL .
IEEE TRANSACTIONS ON SIGNAL PROCESSING, 1993, 41 (04) :1589-1602
[8]   A Structured Model of Video Reproduces Primary Visual Cortical Organisation [J].
Berkes, Pietro ;
Turner, Richard E. ;
Sahani, Maneesh .
PLOS COMPUTATIONAL BIOLOGY, 2009, 5 (09)
[9]  
Bertin Nancy, 2009, 2009 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), P29, DOI 10.1109/ASPAA.2009.5346531
[10]   Enforcing Harmonicity and Smoothness in Bayesian Non-Negative Matrix Factorization Applied to Polyphonic Music Transcription [J].
Bertin, Nancy ;
Badeau, Roland ;
Vincent, Emmanuel .
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2010, 18 (03) :538-549