Audio Watermarking Using Spikegram and a Two-Dictionary Approach

Cited by: 36

Authors:

Erfani, Yousof [1,2]

Pichevar, Ramin [1,3]

Rouat, Jean [1]

Affiliations:

[1] Univ Sherbrooke, Dept Elect & Comp Engn, NECOTIS Grp, Sherbrooke, PQ J1K 2R1, Canada

[2] McMaster Univ, Auditory Engn Lab, Hamilton, ON L8S 4K1, Canada

[3] Apple Inc, Cupertino, CA 95014 USA

Source:

IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY | 2017 / Vol. 12 / No. 04

Funding:

Natural Sciences and Engineering Research Council of Canada (NSERC);

Keywords:

Copyright protection; watermarking; spikegram; gammatone filter bank; sparse representation; multimedia security; spread-spectrum; unified speech; robust; synchronization; attacks; masking; signals; scheme

DOI:

10.1109/TIFS.2016.2636094

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods]

Subject classification code:

081202

Abstract:

This paper introduces a new audio watermarking technique based on a perceptual kernel representation of audio signals (the spikegram). The spikegram is a recent method for representing audio signals. It is combined with a dictionary of gammatones to construct a robust representation of sounds. In traditional phase embedding methods, the phase of the coefficients of a given signal in a specific domain (such as the Fourier domain) is modified. In the encoder of the proposed method (the two-dictionary approach), the signs and the phases of the gammatones in the spikegram are chosen adaptively to maximize the strength of the decoder. Moreover, the watermark is embedded only into kernels with high amplitudes, from which all masked gammatones have already been removed. The efficiency of the proposed spikegram watermarking is shown via several experimental results. First, the method is robust against 32-kb/s MP3 compression at an embedding rate of 56.5 b/s. Second, it is robust against the unified speech and audio codec (24-kb/s USAC, linear predictive and Fourier domain modes) with an average payload of 5-15 b/s. Third, it is robust against simulated small real room attacks with a payload of roughly 1 b/s. Last, it is robust against a variety of signal processing transforms while preserving quality.
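
The embedding idea summarized in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm: it embeds bits in spike signs only, skips phase selection, perceptual masking, and the adaptive two-dictionary search, and assumes the decoder already knows which spikes carry bits. The function names (`gammatone`, `embed`, `decode`), the ERB bandwidth formula, the 16 kHz sampling rate, and the 20 ms kernel length are all assumptions made for illustration.

```python
import math

def gammatone(fc, fs=16000, dur=0.02, order=4, phase=0.0):
    """Unit-norm gammatone kernel t^(order-1) * exp(-2*pi*b*t) * cos(2*pi*fc*t + phase)."""
    # Bandwidth from the Glasberg-Moore ERB formula (an assumed, common choice).
    b = 1.019 * 24.7 * (4.37 * fc / 1000.0 + 1.0)
    n = int(dur * fs)
    g = [((k / fs) ** (order - 1)) * math.exp(-2 * math.pi * b * k / fs)
         * math.cos(2 * math.pi * fc * k / fs + phase) for k in range(n)]
    norm = math.sqrt(sum(v * v for v in g))
    return [v / norm for v in g]

def embed(spikes, bits, threshold, length, fs=16000):
    """Toy sign embedding: each spike with |amplitude| >= threshold carries one bit
    (1 -> positive sign, 0 -> negative sign). Spikes are (amplitude, fc, offset)."""
    signal = [0.0] * length
    carriers = []              # (fc, offset) of the spikes that carry a bit
    bit_iter = iter(bits)
    for amp, fc, offset in spikes:
        mag = abs(amp)
        if mag >= threshold:
            bit = next(bit_iter, None)
            if bit is not None:
                amp = mag if bit else -mag
                carriers.append((fc, offset))
        # Resynthesize: add the scaled kernel at its time offset.
        for k, v in enumerate(gammatone(fc, fs)):
            if offset + k < length:
                signal[offset + k] += amp * v
    return signal, carriers

def decode(signal, carriers, fs=16000):
    """Recover each bit from the sign of the projection onto its carrier kernel."""
    bits = []
    for fc, offset in carriers:
        kernel = gammatone(fc, fs)
        proj = sum(signal[offset + k] * v
                   for k, v in enumerate(kernel) if offset + k < len(signal))
        bits.append(1 if proj >= 0 else 0)
    return bits

# Hypothetical spikegram: three strong spikes and one weak (skipped) spike.
spikes = [(0.9, 500, 0), (0.05, 1200, 100), (-0.7, 2000, 400), (0.8, 3000, 800)]
signal, carriers = embed(spikes, [1, 0, 1], threshold=0.5, length=1200)
print(decode(signal, carriers))
```

In this toy setup the three carrier spikes do not overlap in time, so the weak spike is the only source of interference and the projections keep their signs. A practical scheme would also have to locate the carriers blindly and survive codec and room distortions, which this sketch does not attempt.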

Pages: 840-852

Page count: 13
