Audio Watermarking Using Spikegram and a Two-Dictionary Approach

Cited by: 36

Authors:

Erfani, Yousof ^{[1,2]}

Pichevar, Ramin ^{[1,3]}

Rouat, Jean ^{[1]}

Affiliations:

[1] Univ Sherbrooke, Dept Elect & Comp Engn, NECOTIS Grp, Sherbrooke, PQ J1K 2R1, Canada

[2] McMaster Univ, Auditory Engn Lab, Hamilton, ON L8S 4K1, Canada

[3] Apple Inc, Cupertino, CA 95014, USA

Source:

IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY | 2017 / Vol. 12 / No. 04

Funding:

Natural Sciences and Engineering Research Council of Canada (NSERC);

Keywords:

Copyright protection; watermarking; spikegram; gammatone filter bank; sparse representation; multimedia security; SPREAD-SPECTRUM; UNIFIED SPEECH; ROBUST; SYNCHRONIZATION; ATTACKS; MASKING; SIGNALS; SCHEME;

DOI:

10.1109/TIFS.2016.2636094

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Discipline classification code:

081202 ;

Abstract:

This paper introduces a new audio watermarking technique based on a perceptual kernel representation of audio signals (spikegram). The spikegram is a recent method for representing audio signals; it is combined with a dictionary of gammatones to construct a robust representation of sounds. In traditional phase embedding methods, the phase of the coefficients of a given signal in a specific domain (such as the Fourier domain) is modified. In the encoder of the proposed method (two-dictionary approach), the signs and phases of the gammatones in the spikegram are chosen adaptively to maximize the strength of the decoder. Moreover, the watermark is embedded only into kernels with high amplitudes, from which all masked gammatones have already been removed. The efficiency of the proposed spikegram watermarking is shown via several experimental results. First, the proposed method is shown to be robust against 32 kb/s MP3 with an embedding rate of 56.5 b/s. Second, it is robust against the unified speech and audio codec (24 kb/s USAC, linear predictive and Fourier domain modes) with an average payload of 5-15 b/s. Third, it is robust against simulated small real room attacks with a payload of roughly 1 b/s. Last, it is shown to be robust against a variety of signal processing transforms while preserving quality.
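The abstract does not give the embedding equations, so the following is only a toy illustration of the general idea of carrying watermark bits in the signs of gammatone kernels. It is not the paper's two-dictionary encoder: there is no sparse decomposition, no masking model, and no adaptive phase selection. The function names (`gammatone`, `embed`, `decode`), the sampling rate, the kernel duration, and the ERB-based bandwidth are all assumptions made for this sketch.

```python
import math

def gammatone(fc, fs=16000, dur=0.02, order=4, phase=0.0):
    """Unit-norm gammatone kernel: t^(n-1) * exp(-2*pi*b*t) * cos(2*pi*fc*t + phase)."""
    erb = 24.7 + 0.108 * fc   # Glasberg-Moore ERB approximation in Hz (assumed here)
    b = 1.019 * erb           # commonly used bandwidth scaling (assumed here)
    n = round(dur * fs)
    g = [(k / fs) ** (order - 1)
         * math.exp(-2 * math.pi * b * k / fs)
         * math.cos(2 * math.pi * fc * k / fs + phase)
         for k in range(n)]
    norm = math.sqrt(sum(v * v for v in g))
    return [v / norm for v in g]

def embed(bits, kernel, amplitude=1.0):
    """Place one kernel per bit, back to back; the bit value sets the kernel's sign."""
    signal = []
    for bit in bits:
        sign = 1.0 if bit else -1.0
        signal.extend(sign * amplitude * v for v in kernel)
    return signal

def decode(signal, kernel):
    """Project each kernel-length frame onto the kernel and read the sign."""
    n = len(kernel)
    bits = []
    for start in range(0, len(signal) - n + 1, n):
        proj = sum(signal[start + k] * kernel[k] for k in range(n))
        bits.append(1 if proj > 0 else 0)
    return bits

kernel = gammatone(fc=1000.0)
payload = [1, 0, 1, 1, 0]
recovered = decode(embed(payload, kernel), kernel)
```

In this sketch the decoder recovers each bit from the sign of a single projection, which is why a high-amplitude kernel gives a larger decision margin than a weak one; how the paper actually selects kernels and phases is not stated in the abstract.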


Pages: 840-852

Number of pages: 13

