Dictionary Learning for Sparse Audio Inpainting

Cited by: 8
Authors
Taubock, Georg [1]
Rajbamshi, Shristi [1]
Balazs, Peter [1]
Affiliations
[1] Austrian Acad Sci, Acoust Res Inst, A-1040 Vienna, Austria
Keywords
Reliability; Dictionaries; Signal processing algorithms; Machine learning; Time-frequency analysis; Time-domain analysis; Frequency modulation; Audio inpainting; convex; dictionary; frame; Gabor; learning; optimization; sparsity; time-frequency
DOI
10.1109/JSTSP.2020.3046422
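Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809
Abstract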
The objective of audio inpainting is to fill a gap in an audio signal. This is ideally done by reconstructing the original signal or, at least, by inferring a meaningful surrogate signal. We propose a novel approach that applies sparse modeling in the time-frequency (TF) domain. In particular, we devise a dictionary learning technique that learns the dictionary from the reliable parts around the gap, with the goal of obtaining a signal representation with increased TF sparsity. It is based on a basis optimization technique that deforms a given Gabor frame such that the sparsity of the analysis coefficients of the resulting frame is maximized. Furthermore, we modify the SParse Audio INpainter (SPAIN) for both the analysis and the synthesis model such that it is able to exploit the increased TF sparsity and, in turn, benefits from dictionary learning. Our experiments demonstrate that the developed methods achieve significant gains in terms of signal-to-distortion ratio (SDR) and objective difference grade (ODG) compared with several state-of-the-art audio inpainting techniques.
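As a rough illustration of the SDR metric named in the abstract, the sketch below uses the definition commonly found in the audio inpainting literature, SDR = 10 log10(||x||^2 / ||x - x_hat||^2), evaluated on the gap samples only. This is an assumption: the record does not state the exact formula or evaluation region used in the paper. The function name `sdr_db`, the test tone, the gap position, and the 0.9 scaling are all hypothetical and chosen only for the example.

```python
import numpy as np


def sdr_db(reference, estimate):
    """SDR in dB, assuming 10*log10(||x||^2 / ||x - x_hat||^2)."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    if reference.shape != estimate.shape:
        raise ValueError("reference and estimate must have the same shape")
    signal_energy = np.sum(reference ** 2)
    error_energy = np.sum((reference - estimate) ** 2)
    if error_energy == 0.0:
        # Perfect reconstruction: the ratio is unbounded.
        return float("inf")
    return 10.0 * np.log10(signal_energy / error_energy)


# Toy signal: a 440 Hz tone sampled at 16 kHz (hypothetical test data).
fs = 16000
t = np.arange(1024) / fs
x = np.sin(2 * np.pi * 440 * t)

# Hypothetical gap and an imperfect "reconstruction" that is 10% too quiet.
gap = slice(400, 600)
x_hat = x.copy()
x_hat[gap] = 0.9 * x[gap]

# Evaluate only on the gap; the reliable samples are identical by construction.
sdr_gap = sdr_db(x[gap], x_hat[gap])
print(f"SDR on the gap: {sdr_gap:.2f} dB")
```

Because the error here is exactly one tenth of the signal amplitude, the energy ratio is 100 and the sketch reports about 20 dB. ODG, the other metric in the abstract, comes from a perceptual model and is not reproduced here.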
Pages: 104-119
Number of pages: 16
Related papers
62 in total
[1]  
Abreu L. D., 2018, ARXIV180802258
[2]   Audio Inpainting [J].
Adler, Amir ;
Emiya, Valentin ;
Jafari, Maria G. ;
Elad, Michael ;
Gribonval, Remi ;
Plumbley, Mark D. .
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2012, 20 (03) :922-932
[3]  
Adler A, 2011, INT CONF ACOUST SPEE, P329
[4]   K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation [J].
Aharon, Michal ;
Elad, Michael ;
Bruckstein, Alfred .
IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2006, 54 (11) :4311-4322
[5]  
[Anonymous], 2014, ARXIV14020779
[6]  
[Anonymous], EBU SQAM 400 SOUND Q
[7]  
[Anonymous], CVX: MATLAB software for disciplined convex programming, version 2.1
[8]  
[Anonymous], 2002, APPL DIGITAL SIGNAL, DOI 10.1007/978-1-4471-1561-8
[9]   Audio Soft Declipping Based on Constrained Weighted Least Squares [J].
Avila, Flavio R. ;
Tcheou, Michel P. ;
Biscainho, Luiz W. P. .
IEEE SIGNAL PROCESSING LETTERS, 2017, 24 (09) :1348-1352
[10]   Self-content-based audio inpainting [J].
Bahat, Yuval ;
Schechner, Yoav Y. ;
Elad, Michael .
SIGNAL PROCESSING, 2015, 111 :61-72